Research Engineer - Multimodal Generation
Are you excited by the challenge of advancing AI-driven content and data generation across language, vision, and audio? At Monolith, we are building state-of-the-art multimodal generation tools that empower engineers and scientists to innovate faster, smarter, and more creatively. Join our mission-focused team to shape the next generation of technology that bridges text, images, and sound for real-world engineering use cases.
Accelerate Your Engineering Career with Multimodal AI Generation
At Monolith, you will collaborate with domain experts, AI researchers, and engineers who are breaking new ground at the intersection of language, vision, and audio. Our collaborative and inclusive culture will support your growth as you help translate academic breakthroughs into high-impact production systems.
Key Responsibilities – Multimodal AI Research and Engineering
- Design, develop, and optimise multimodal generative models that combine text, images, and audio in engineering workflows.
- Lead and contribute to research initiatives that advance the capabilities of AI-based content and data generation across multiple domains.
- Implement scalable and robust machine learning pipelines integrating state-of-the-art multimodal architectures (e.g. transformers, diffusion models, large language models).
- Collaborate with multidisciplinary teams—including domain experts and product stakeholders—to translate user requirements into technical solutions.
- Stay up to date with advances in generative AI, presenting research findings to colleagues and integrating the latest academic insights into our platform.
Who You Are – Experience & Skills for Multimodal Generation Engineering
- Experience in building and deploying machine learning models for multimodal data (text, images, audio, or video) within research or product contexts.
- Deep understanding of modern generative AI approaches (such as diffusion models, GANs, VAEs, vision-language models, or large language models).
- Proficiency in Python and relevant ML frameworks (e.g. PyTorch, TensorFlow, JAX), with strong data engineering abilities.
- Excellent analytical and problem-solving skills; capable of communicating technical concepts clearly to both technical and non-technical teams.
- Collaborative mindset, comfortable working both independently and as part of diverse multidisciplinary teams in a remote-first environment.
Desirable Qualities and Certifications for a Multimodal Research Engineer
- Research background or publications in language-vision models, audio-visual processing, or multimodal data integration.
- Familiarity with cloud-based ML/AI infrastructure and deploying production-scale data and inference pipelines.
- Contributions to open-source AI projects or participation in machine learning competitions.
- Relevant advanced degree (MSc, PhD) in AI, computer science, computational engineering, or a related discipline (not mandatory).
Work Culture, Benefits, and Opportunities for Growth at Monolith
- Remote-first working model, with the option to collaborate at our London HQ.
- Flexible work arrangements and respect for individual needs and work–life balance.
- Competitive, gender-neutral compensation and comprehensive benefits package.
- Continuous learning and professional development support, including conferences and certifications.
- An empowering, inclusive environment where every perspective and contribution is valued.
How to Succeed as a Research Engineer in Multimodal Generation
If you thrive on autonomy, are motivated by inventing solutions to complex technical challenges, and are passionate about delivering real-world impact with AI, we’d love to meet you. We welcome applicants from every background who care about innovation, knowledge-sharing, and responsible AI development.
Excited to Build the Future of Multimodal Generative AI?
We strongly encourage candidates of all genders, ethnicities, abilities, and backgrounds to apply. If you are ready to elevate AI-driven multimodal generation and make a difference for global engineering, apply now and join Monolith in shaping the future of intelligent technology!
- Department: Data Science
- Locations: London, United Kingdom
- Remote status: Hybrid