Role Description
Mercor is collaborating with a leading AI lab on a short-term project focused on improving preference ranking models for conversational AI systems. We're seeking detail-oriented generalists, ideally with prior experience in data labeling or content evaluation, to assess and rank model outputs across a variety of domains. This opportunity is well suited to professionals who are comfortable making nuanced judgments and working independently in a remote setting.
Key Responsibilities
- Evaluate and compare AI-generated responses based on quality, coherence, and helpfulness
- Assign preference rankings to pairs or sets of model outputs
- Follow detailed labeling guidelines and adjust based on evolving criteria
- Provide brief written explanations for ranking decisions when required
- Flag edge cases or inconsistencies in task design or model output
Qualifications
- Prior experience in data labeling, content moderation, or preference ranking tasks
- Excellent critical thinking and reading comprehension skills
- Comfort working with evolving guidelines and ambiguity
- Strong attention to detail and consistency across repetitive tasks
- Availability for regular part-time work each week
Engagement Details
- Remote and asynchronous; set your own hours
- Expected commitment: 10–20 hours/week
- Flexible workload depending on your availability and performance
Benefits
- $25–35/hour depending on experience and location
- Payments issued weekly via Stripe Connect
- This is a freelance engagement; you’ll be classified as an independent contractor
Application Process
- Submit your resume to get started
- Complete a short form to highlight your relevant experience
- You may be asked to complete a brief assessment to evaluate task fit
- Expect a response within 3–5 business days