Educational Requirements:
• Graduate degree in Linguistics or related field is a must; PhD is a plus
• A graduate degree in Literature or English is not an appropriate substitution
• A background or specialization in corpus linguistics is a plus
• Experience with field work is a plus
• Must have a very firm grasp of the following linguistic fields: language typology, syntax, morphology, sociolinguistics (especially dialectology and discourse analysis), corpus linguistics, writing systems, pragmatics, phonology.
• Must have some experience with applying basic Natural Language Processing techniques
• Experience working cross-functionally
• Experience collaborating with machine learning, NLP, or software engineers, or data scientists
• Experience contributing to research papers
Must-Have Skills:
• 3+ years of experience applying strong theoretical linguistics skills to language technology
• Must have native or near-native proficiency / fluency in Portuguese (especially Brazilian)
• Must be able to code in Python and query databases using SQL; other coding languages used for data analysis (e.g., R) are a plus
• Ability to handle frequent stress-testing and red-teaming exercises, which involve exposure to toxic language
Nice-to-Have Skills:
• Working knowledge of other languages is a plus; proficiency in a low-resource language is valued
o Education or training in the basics of project management is a plus
o Self-motivation is a must
• Working knowledge of international language-classification standards is valued
• Project management experience, particularly working extensively with vendors
• Ability to track projects and judge whether a vendor's workload estimate is accurate
Soft Skills:
• Must have strong written and spoken communication skills, especially business and research communication
• Must be able to independently work through complex requests and perform under pressure
• Strong ability to work independently, prioritize, plan, and track work, as well as report progress
• Ability to ask questions when needed (researcher mentality)
• Excellent English communication skills (does not need to be a native speaker)
The main function of a Linguist is to determine speech data needs and set the strategic vision for data-based model and product improvements.
Candidate Value Proposition:
The individual will be joining a team that has been working on systems covering 200 languages, work that was published in the journal Nature and has been described as a universal translator. This is a great experience for individuals interested in translation or language technology!
Role Responsibilities (including, but not limited to):
• Perform linguistic analyses on large datasets
• Perform linguistic error analysis and quality assessment on AI model outputs, determining what the most frequent and severe error categories are
• Write and revise guidelines for human annotation, translation and dataset creation projects
• Conduct typological and sociolinguistic research on many languages, highlighting their similarities and differences
• Perform linguistic analyses for Responsible AI (toxic language, hate speech, gender bias and other cultural biases) in massively multilingual settings
• Conduct linguistic literature reviews on various NLP-adjacent topics and summarize findings
• Evaluate the quality of human translations or human-generated data, identify error patterns, and provide actionable feedback
• Provide information or guidance relative to any aspect of linguistic knowledge (typology, morpho-syntax, sociolinguistics, classification, phonetics/phonology, pragmatics, etc.)
• Reach out to and collaborate with native speakers of various languages
• Communicate results of linguistic analyses to engineers and research scientists
Performance Measurement:
Performance is measured on how well this individual completes work packages.
Past FAIR projects that have been strongly supported by our Linguistics group:
• No Language Left Behind (text-to-text machine translation for 200 languages)
o Demonstrated non-English-centric translation capabilities for low-resource languages
o Constituted a scientific breakthrough published in Nature
o Currently powers the UNESCO translator on Hugging Face
• Seamless (speech-to-speech translation for 100 languages)
o Demonstrated expressive translation capabilities for 100 languages
o Was included in Time's 200 Best Inventions (2023)
Skills Assessment:
• 1st Round – Technical interview with CWAM + linguist engineer, 1 hour
o Applying theoretical linguistics to various situations
o The second part of the first round tests coding skills, mostly in Python, with some database work and occasional SQL queries
• 2nd Round – Behavioral, 30-45 mins
- **Only those lawfully authorized to work in the designated country associated with the position will be considered.**
- **Please note that all Position start dates and duration are estimates and may be reduced or lengthened based upon a client’s business needs and requirements.**