Senior Data Scientist

Egypt, Jordan, KSA
Job Description

We are seeking a Senior Data Scientist with deep expertise in unstructured data (audio, speech, text, images, etc.) and a strong background in deploying Large Language Models (LLMs) and AI models at scale. This role focuses on real-world implementation, ensuring that models are efficient, scalable, and optimized for enterprise deployment.

You will work closely with large enterprises, delivering AI-powered solutions that meet real-world performance benchmarks (speed, latency, throughput). The ideal candidate has hands-on experience optimizing LLMs through quantization and pruning, designing distributed training pipelines, and working with AI agents to build end-to-end products rather than simply leveraging open-source tools. The role also requires a deep understanding of multimodal architectures and optimization techniques such as model distillation and retrieval-augmented generation (RAG).

Key Responsibilities

  • Develop and deploy AI models for unstructured data (text, speech, audio, images) with a focus on enterprise-scale performance.
  • Fine-tune, optimize, and deploy LLMs and multimodal models, integrating distributed training, quantization, and pruning techniques for efficiency.
  • Design and implement production-ready AI solutions, ensuring scalability, low-latency inference, and high throughput.
  • Work with AI agents and automation frameworks to create intelligent, real-world AI applications for enterprise clients.
  • Build and maintain end-to-end LLMOps pipelines, ensuring efficient training, deployment, monitoring, and model updates.
  • Implement vector search and retrieval-augmented generation (RAG) systems for large-scale data solutions (a rough retrieval sketch follows this list).
  • Monitor AI performance using key metrics such as speed, latency, and throughput, continuously refining models for real-world efficiency.
  • Work with cloud-based AI infrastructure (AWS, GCP) and containerized environments (Docker, Kubernetes) to scale AI solutions.
  • Collaborate with engineering, DevOps, and product teams to align AI solutions with business needs and client requirements.
  • Implement data curation pipelines, including data collection, cleaning, deduplication, and decontamination, for training high-quality AI models.
  • Implement self-instruct and synthetic data generation techniques to enrich datasets for low-resource languages and specialized domains.
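For illustration only, the retrieval step behind the RAG responsibility above might look roughly like the Python sketch below. Everything in it is an assumption rather than this team's actual stack: the embed() helper is a hypothetical stand-in for a real embedding model, and in production the index would live in a vector database and the assembled prompt would go to the deployed LLM.

    # Minimal RAG retrieval sketch (illustrative only; embed() is a hypothetical stand-in).
    import numpy as np

    def embed(texts):
        # Toy embedding: hash tokens into a fixed-size bag-of-words vector,
        # then L2-normalise so a dot product equals cosine similarity.
        vecs = np.zeros((len(texts), 256))
        for i, text in enumerate(texts):
            for token in text.lower().split():
                vecs[i, hash(token) % 256] += 1.0
        return vecs / (np.linalg.norm(vecs, axis=1, keepdims=True) + 1e-9)

    documents = [
        "Quantization reduces weight precision to cut memory use and latency.",
        "Pruning removes redundant weights or attention heads from a network.",
        "RAG augments an LLM prompt with documents retrieved from a vector index.",
    ]
    doc_vecs = embed(documents)

    def retrieve(query, k=2):
        # Rank documents by cosine similarity to the query embedding.
        scores = doc_vecs @ embed([query])[0]
        return [documents[i] for i in np.argsort(scores)[::-1][:k]]

    query = "How does retrieval-augmented generation work?"
    context = "\n".join(retrieve(query))
    prompt = f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
    print(prompt)  # In production, this prompt would be sent to the deployed LLM.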

Required Qualifications

  • 5+ years of hands-on experience in AI, Machine Learning, and Data Science, with a strong focus on production-scale AI.
  • Expertise in LLMs, including fine-tuning, distributed training, quantization, and pruning techniques (a minimal quantization sketch follows this list).
  • Experience working with OCR (optical character recognition), ASR (automatic speech recognition), and TTS (text-to-speech) applications in real-world deployments.
  • Proven experience deploying AI models in production, with real-world examples of scaled AI applications.
  • Strong understanding of cloud computing, containerization (Docker, Kubernetes), and MLOps best practices.
  • Proficiency in Python, PyTorch, and ML libraries.
  • Hands-on experience with vector databases and retrieval-augmented generation (RAG) architectures.
  • Strong awareness of AI system performance benchmarks (latency, speed, throughput) and ability to optimize models accordingly.
  • Experience working with AI agents and designing real-world intelligent automation solutions, beyond open-source experimentation.
  • Proficiency in transformer-based architectures (BERT, GPT, LLaMA, Whisper, etc.), including pre-training, fine-tuning, and task-specific adaptation.
  • Expertise in distributed training methodologies, including ZeRO-Offload, DeepSpeed, and FSDP (Fully Sharded Data Parallel).
  • Experience in large-scale data curation including data cleaning, formatting, deduplication, decontamination, etc.
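As a loose illustration of the quantization expertise listed above (not a prescription of the tooling used here), the sketch below applies PyTorch's post-training dynamic quantization to a toy two-layer model; the layer sizes and the accuracy check are assumptions chosen purely for the example.

    # Post-training dynamic quantization sketch (PyTorch); toy model, not a production LLM.
    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(768, 3072),
        nn.ReLU(),
        nn.Linear(3072, 768),
    )

    # Convert nn.Linear weights to int8; activations stay float and are
    # quantized/dequantized on the fly at inference time.
    quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

    x = torch.randn(1, 768)
    with torch.no_grad():
        out_fp32 = model(x)
        out_int8 = quantized(x)

    # Compare output drift to gauge the accuracy cost of the smaller, faster model.
    print("max abs diff:", (out_fp32 - out_int8).abs().max().item())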

Preferred Qualifications

  • Experience in multi-modal AI models that integrate text, speech, and vision.
  • Hands-on work with self-supervised learning, few-shot learning, and reinforcement learning.
  • Experience designing and deploying AI solutions for large enterprises, ensuring high availability, robustness, and business impact.
  • Knowledge of AI inference optimization techniques for real-time applications.

Please upload your CV to apply.

Max file size 10MB.

By submitting your CV, you consent to the processing of your personal data in accordance with the General Data Protection Regulation (GDPR). Your information will be used solely for recruitment purposes and will not be shared with third parties without your consent. You may request access, correction, or deletion of your data at any time by contacting [contact email].
