Innodata Inc.
Statistics Specialist
Job Location
Mexico, Mexico
Job Description
Innodata (NASDAQ: INOD) is a leading data engineering company. With more than 2,000 customers and operations in 13 cities around the world, we are an AI technology solutions provider of choice for 4 out of 5 of the world’s biggest technology companies, as well as leading companies across financial services, insurance, technology, law, and medicine. By combining advanced machine learning and artificial intelligence (ML/AI) technologies, a global workforce of subject matter experts, and a high-security infrastructure, we’re helping usher in the promise of AI. Innodata offers a powerful combination of both digital data solutions and easy-to-use, high-quality platforms. Our global workforce includes over 5,000 employees in the United States, Canada, United Kingdom, the Philippines, India, Sri Lanka, Israel and Germany. We’re poised for a period of explosive growth over the next few years.

About the Role
At Innodata, we’re working with the world’s largest technology companies on the next generation of generative AI and large language models (LLMs). We’re looking for smart, savvy, and curious subject matter experts.

We are seeking a highly skilled Data Scientist with a Master’s or PhD in Statistics, Applied Mathematics, or a related quantitative field to join our cutting-edge AI team. In this role, you will contribute to the development, training, and fine-tuning of Large Language Models (LLMs) by applying statistical methods, data analysis, and model evaluation techniques. Your expertise will directly impact the accuracy, performance, and robustness of advanced AI systems deployed for real-world applications.

Key Responsibilities
- Design and implement statistical frameworks for the preparation, analysis, and quality assessment of large-scale datasets used in training LLMs.
- Perform exploratory data analysis (EDA) to detect biases, data inconsistencies, and anomalies in training corpora.
- Collaborate closely with AI researchers and engineers to develop data annotation guidelines and evaluation protocols.
- Apply advanced statistical methods to assess model outputs and guide iterative training improvements.
- Develop and maintain automated tools for data preprocessing, labeling, sampling, and quality control in LLM pipelines.
- Conduct rigorous hypothesis testing, A/B testing, and performance benchmarking of LLMs across various tasks.
- Interpret model behaviors using statistical insights and provide actionable recommendations for model fine-tuning.
- Document methodologies, maintain reproducible workflows, and contribute to technical publications.

Key Qualifications
- Master’s or PhD in Statistics, Applied Mathematics, Data Science, Computer Science (with strong statistics focus), or related quantitative field.
- Strong knowledge of statistical modeling, probability theory, and experimental design.
- Hands-on experience with large-scale datasets, data cleaning, feature engineering, and statistical analysis tools.
- Proficiency in programming languages such as Python, R, or Julia, including libraries like pandas, NumPy, SciPy, scikit-learn, or statsmodels.
- Familiarity with machine learning frameworks (TensorFlow, PyTorch) and understanding of LLM architectures (transformers, attention mechanisms).
- Experience with annotation tools and data labeling processes is a plus.
- Strong analytical thinking, attention to detail, and problem-solving skills.
- Excellent written and verbal communication skills in English.

As part of the project, you are required to complete the English language assessment. *The assessment is mandatory & non-billable*

If interested, kindly share your updated resume at: tsingh3@innodata.com
Location: Mexico, MX
Posted Date: 9/14/2025
Contact Information
Contact: Human Resources, Innodata Inc.