Role Description
Mercor is collaborating with a leading AI lab on a short-term project focused on improving preference ranking models for conversational AI systems. We're seeking detail-oriented generalists, ideally with prior experience in data labeling or content evaluation, to assess and rank model outputs across a variety of domains. This opportunity is well suited to professionals who are comfortable making nuanced judgments and working independently in a remote setting.
Key Responsibilities
- Evaluate and compare AI-generated responses based on quality, coherence, and helpfulness
- Assign preference rankings to pairs or sets of model outputs
- Follow detailed labeling guidelines and adjust based on evolving criteria
- Provide brief written explanations for ranking decisions when required
- Flag edge cases or inconsistencies in task design or model output
Qualifications
- Prior experience in data labeling, content moderation, or preference ranking tasks
- Excellent critical thinking and reading comprehension skills
- Comfort working with evolving guidelines and ambiguity
- Strong attention to detail and consistency across repetitive tasks
- Availability for regular part-time work each week
Engagement Details
- Remote and asynchronous; set your own hours
- Expected commitment: 10–20 hours/week
- Flexible workload depending on your availability and performance
Benefits
- $25–35/hour depending on experience and location
- Payments issued weekly via Stripe Connect
- This is a freelance engagement; you’ll be classified as an independent contractor
Application Process
- Submit your resume to get started
- Complete a short form to highlight your relevant experience
- You may be asked to complete a brief assessment to evaluate task fit
- Expect a response within 3–5 business days