Senior LLM Engineer

San Sebastián, Guipúzcoa
Permanent
Full-time

Posted 1 month ago

Multiverse Computing

Multiverse is a well-funded, fast-growing deep-tech company founded in 2019. We are the largest quantum software company in the EU and have been recognized by CB Insights (2023 and 2025) as one of the 100 most promising AI companies in the world.

With 180+ employees and growing, our team is fully multicultural and international. We deliver hyper-efficient software for companies seeking a competitive edge through quantum computing and artificial intelligence.

Our flagship products, CompactifAI and Singularity, address critical needs across various industries:

CompactifAI is a groundbreaking compression tool for foundational AI models based on Tensor Networks. It enables the compression of large AI systems, such as language models, to make them significantly more efficient and portable.

Singularity is a quantum- and quantum-inspired optimization platform used by blue-chip companies to solve complex problems in finance, energy, manufacturing, and beyond. It integrates seamlessly with existing systems and delivers immediate performance gains on classical and quantum hardware.

You’ll be working alongside world-leading experts to develop solutions that tackle real-world challenges. We’re looking for passionate individuals eager to grow in an ethics-driven environment that values sustainability and diversity.

We’re committed to building a truly inclusive culture. Come and join us.

As a Senior LLM Engineer, you will:

Design and implement strategies for creating, sourcing, and augmenting datasets tailored for LLM training and fine-tuning.

Develop scalable pipelines to collect, clean, filter, annotate, and validate large volumes of text data, ensuring quality and ethical compliance.

Collaborate with ML engineers, researchers, and software engineers to achieve ambitious goals in the preparation of LLMs and complementary work (preparing datasets, model evaluation, model serving, etc.).

Develop and integrate new routines for modifying and enhancing LLMs, and extending their functionality.

Make effective use of distributed compute resources and GPU clusters, and identify opportunities for further optimization.

Prepare compressed and specialized LLMs end to end for use in production.

Keep up to date with research trends in LLM foundation models, dataset curation, LLM pretraining data, and benchmarking.

Contribute to building documentation, development standards, and a healthy shared code base.

Mentor other engineers and share knowledge of cutting-edge techniques.

Required Qualifications

Master’s or Ph.D. in Computer Science, AI, Data Science, Physics, Math, or a related field, or equivalent industry experience.

3+ years of experience in data science, machine learning, or related roles, including demonstrated experience with NLP or LLMs.

In-depth knowledge of large foundational model architectures (language and multimodal models) and their lifecycle: training, fine-tuning, alignment, and evaluation.

Proficiency in Python and its data tooling ecosystem (Pandas, NumPy, and the Hugging Face Datasets and Transformers libraries).

Hands-on experience with text data collection from diverse sources: web scraping, APIs, proprietary corpora, etc.

Strong understanding of data quality metrics including bias detection, toxicity, and readability.

Experience working in large shared distributed computing environments, and familiarity with relevant tools for hardware optimization (vLLM, TensorRT, NeMo, etc.).

Experience with version control (Git), unit testing, and other fundamental aspects of software development.

Effective communication and interpersonal abilities.

Preferred Qualifications

Experience building or contributing to datasets used in LLM pretraining or supervised fine-tuning.

Experience building foundational LLMs from the ground up.

Familiarity with alignment techniques (e.g., reinforcement learning, preference modeling, reward modeling).

Exposure to multilingual and low-resource language datasets.

Contributions to open-source datasets, tools, or publications in dataset-centric research.

Knowledge of ethical AI, data governance, privacy laws (e.g., GDPR), and responsible data use.

Familiarity with the software development lifecycle and agile methodologies.

Perks & Benefits

Indefinite contract.

Equal pay guaranteed.

Variable performance bonus.

Signing bonus.

Work visa sponsorship (if applicable).

Relocation package (if applicable).

Private health insurance.

Eligibility for educational budget according to internal policy.

Hybrid work option.

Flexible working hours.

Language classes and discounted lunch options.

A fast-paced environment working on cutting-edge technologies.

Career plan. Opportunity to learn and teach.

Progressive company. Happy people culture.

As an equal opportunity employer, Multiverse Computing is committed to building an inclusive workplace. The company welcomes applications from people of all backgrounds, regardless of age, citizenship, ethnic and racial origin, gender identity, disability, marital status, religion and ideology, or sexual orientation.

Come and join our multicultural team!

5 locations, 27+ languages
