
Main Responsibilities of a Data Linguistic Analyst at Meta

Scott's role as a Data Linguistic Analyst at Meta centers on "red teaming": attempting to "break" the language model by engaging it in unsafe conversations to identify vulnerabilities. The work involves interpreting complex policy guidelines, analyzing model responses, and supplying high-quality data to a reinforcement learning process, which requires Scott to rapidly build expert knowledge across diverse domains so that the data he provides avoids the "garbage in, garbage out" trap.

AI Safety, Prompt Engineering, Natural Language Processing, Data Analysis, Ethical Considerations

Advizer Information

Name: Scott N.

Job Title: Data Linguistic Analyst

Company: Meta

Undergrad: Loyola Marymount University

Grad Programs: N/A

Majors: Psychology

Industries: Technology

Job Functions: Data and Analytics

Traits:

Video Highlights

1. Scott's role involves direct interaction with a language model, focusing on safety and ethical considerations. He evaluates the model's responses based on policy and identifies unsafe or dangerous conversations.

2. He acts as a prompt engineer, "red teaming" the language model to find vulnerabilities and improve its safety through reinforcement learning. His work emphasizes data quality to ensure the effectiveness of the learning process.

3. The role requires expertise in linguistics to define and identify "unsafe" content and necessitates quick research across diverse domains (locales, technologies, social issues) to maintain data quality and relevance.

Transcript

What are your main responsibilities within your current role?

My day-to-day work primarily involves interacting directly with the language model. This means I'm essentially conversing with it and exploring specific areas that are prone to safety issues.

There's a qualitative aspect to this evaluation. I read a policy set and assess the language model's responses based on the context, my intentions as a prompter, and the model's output. This policy set acts as a set of ethical guidelines for the AI, defining what it can and cannot do.

My role focuses solely on safety, not on the general quality of the model. In essence, I act as a prompt engineer for safety purposes. This involves "red teaming," which means trying to intentionally break or "jailbreak" the language model.

I then report these findings to the engineers for a reinforcement learning process. This allows them to use examples of dangerous or unsafe conversations.
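Scott doesn't describe Meta's actual tooling, but the workflow he outlines (converse with the model, judge each response against a policy set, and pass the unsafe conversations on as reinforcement learning data) can be sketched as a toy pipeline. Everything here is hypothetical: the `Conversation` class, the keyword-based `POLICY_VIOLATIONS` list, and the function names are illustrative stand-ins for what is, in practice, a qualitative human judgment.

```python
from dataclasses import dataclass

@dataclass
class Conversation:
    prompt: str       # what the red teamer asked
    response: str     # what the language model produced
    label: str = "unlabeled"   # becomes "safe" or "unsafe" after review

# Hypothetical stand-in for a policy set: real policy review is a
# qualitative judgment, not keyword matching.
POLICY_VIOLATIONS = ["weapon assembly", "bypass safety"]

def label_conversation(conv: Conversation) -> Conversation:
    """Judge one model response against the policy set."""
    text = conv.response.lower()
    conv.label = "unsafe" if any(p in text for p in POLICY_VIOLATIONS) else "safe"
    return conv

def collect_rl_examples(convs: list[Conversation]) -> list[Conversation]:
    """Gather the unsafe conversations to hand off to the engineers
    as examples for the reinforcement learning process."""
    labeled = [label_conversation(c) for c in convs]
    return [c for c in labeled if c.label == "unsafe"]
```

The point of the sketch is the shape of the loop, not the classifier: the analyst's labeled conversations are the raw material the engineering team consumes downstream.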

There's a lot of jargon involved, particularly around defining terms like "unsafe" and "dangerous." I have to consider these definitions every single day and evaluate every conversation accordingly.

The reinforcement learning process itself is the technical part, which I'm not directly involved in. However, I am responsible for the data that feeds into that process.

In AI, there's a saying: "garbage in, garbage out." While I don't oversee the model's overall quality, I must ensure the data I provide is high-quality. This means it needs to align with the policy set and be relevant for improving the language model.
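The "garbage in, garbage out" concern suggests a quality gate on the data before it reaches the reinforcement learning process. The check below is a minimal sketch under assumed conventions (the dictionary fields, the label set, and the `policy_rationale` requirement are all hypothetical, not Meta's actual schema): every example must be complete, carry a recognized label, and align with the policy set via a written rationale.

```python
# Hypothetical label vocabulary; the real policy set defines these terms.
ALLOWED_LABELS = {"safe", "unsafe", "borderline"}

def passes_quality_bar(example: dict) -> bool:
    """'Garbage in, garbage out': reject any example that would
    degrade the reinforcement learning data."""
    return (
        bool(example.get("prompt", "").strip())            # no empty prompts
        and bool(example.get("response", "").strip())      # no empty responses
        and example.get("label") in ALLOWED_LABELS         # label matches the policy set
        and bool(example.get("policy_rationale", "").strip())  # judgment is documented
    )

def filter_batch(batch: list[dict]) -> list[dict]:
    """Keep only the examples that meet the quality bar."""
    return [ex for ex in batch if passes_quality_bar(ex)]
```

A gate like this doesn't raise the ceiling on model quality, but it keeps low-quality annotations from silently feeding the learning process, which is the analyst's side of the contract.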

I find many of the technical aspects enjoyable, including staying updated on research papers and developing an intuition for how language models work. I also have to research various domains for different tasks.

These domains can include locales, technologies, or even social issues. A significant part of my responsibility is to quickly build expertise, even if it's simulated, to ensure the data I provide to the engineers raises the quality bar.

Advizer Personal Links

scottn66.github.io
