Introduction
The evaluations project at the Alignment Research Center is a new team building capability evaluations (and in the future, alignment evaluations) for advanced ML models. The goals of the project are to improve our understanding of what alignment danger is going to look like, understand how far away we are from dangerous AI, and create metrics that labs can make commitments around (e.g. 'If you hit capability threshold X, don't train a larger model until you've hit alignment threshold Y'). You can learn more about the project at this linked post on the Alignment Forum.
We have filled this position, but we may hire for it again in the future. You are welcome to register your interest in case the role reopens, but we will not actively review submissions for now and may not respond.
Job Mission
- Take ownership of designing and maintaining data pipelines that minimize friction in generating, labeling, quality-checking, filtering, searching and compiling ML datasets covering many aspects of model behavior.
- Take responsibility for the smooth gathering and processing of data at the organization as a whole, and take action to proactively address potential problems.
- Allow researchers to quickly understand properties like quality and balance of the data we've gathered.
- Run experiments to discover the most effective prompting or finetuning strategies.
- Ensure all results are reliable and easily replicable.
- Produce scaling laws and other investigations of how results vary according to model properties.
- Stay focused on ensuring accurate assessments of model capabilities; be scrappy, willing to move beyond your job description, and part of a fast-moving team.
Example Projects
- Set up an efficient data pipeline and workflow for checking contractor agreement on data labels.
- Set up data pipelines for model behavior trajectories which allow easy searching, filtering and rating.
- Work with different evaluation projects to convert datasets into suitable formats.
- Work with researchers to identify data hypotheses, and set up experiments and visualizations of the data to investigate them.
- Work with ML researchers to generate and process data to analyze scaling laws for different model capabilities.
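As an illustration of the first example project above (checking contractor agreement on data labels), a minimal sketch of computing Cohen's kappa between two raters is shown below. The function name, labels, and data are hypothetical, not part of any actual ARC pipeline:

```python
# Hypothetical sketch: quantifying agreement between two contractors'
# labels with Cohen's kappa. All names and example data are illustrative.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items both raters labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independence, from each rater's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n)
        for c in set(labels_a) | set(labels_b)
    )
    return (observed - expected) / (1 - expected)

# Two contractors labeling six model outputs (illustrative labels).
a = ["safe", "safe", "unsafe", "safe", "unsafe", "safe"]
b = ["safe", "unsafe", "unsafe", "safe", "unsafe", "safe"]
print(round(cohens_kappa(a, b), 3))  # → 0.667
```

Kappa corrects raw agreement for agreement expected by chance, which matters when labels are imbalanced; a real pipeline would extend this to many raters and flag low-kappa batches for review.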
Skills
Essential Skills
- Prior experience with finetuning LLMs or working with natural-language data.
Bonus Skills
- Expertise in particular domains relevant to model takeover, e.g. cybersecurity or ML engineering, in order to help assess model competence on these tasks.
- Design and data-visualization skills: generating novel ideas for improving the usability of evaluation tools and providing an interaction experience that unlocks insights about the data or the model.
- Experience with large-scale data or distributed systems.
- Experience dealing with the engineering details of large models.