Data Engineer
<div class="show-more-less-html__markup show-more-less-html__markup--clamp-after-5 relative overflow-hidden"> <strong>Why Omnilex?<br/><br/></strong>At Omnilex, we’re on a mission to transform the way lawyers work. Our AI-native platform lets legal professionals enhance their productivity in legal research and automate workflows. We collaborate closely with our clients and iterate at a market-leading pace. Within a year, we have gone from an early MVP to a product used daily by thousands of legal professionals at our clients in Switzerland, Germany, and Liechtenstein, and we are now scaling rapidly across Europe.<br/><br/>We already stand out for our strong data engineering, including our combination of external data, customer-internal data, and our own innovative AI-first legal commentaries.<br/><br/>You’ll be joining a young, passionate, and dynamic team of 15, with roots at ETH Zurich.<br/><br/><strong>Your role<br/><br/></strong>Are you excited about turning messy, multi-jurisdiction legal content into clean, structured, AI-ready data? Do you enjoy building reliable pipelines for extraction, normalization, chunking, citation handling, tagging, structuring, summarizing, and indexing, and then measuring their quality and cost? Do you thrive in a fast-paced startup where your work directly powers search, AI answer quality, and analytics?
If so, we’d love to hear from you!<br/><br/><strong>What You'll Do<br/><br/></strong><ul><li>As a Data Engineer focused on AI data processing and integration, you will build and own the data flows that make our AI features accurate, explainable, and scalable</li><li>Design and maintain ingestion for legal sources (APIs, scraping, bulk data) across jurisdictions with strong reliability and compliance</li><li>Normalize and model heterogeneous sources into pragmatic, typed schemas (statutes, decisions, commentaries, citations, metadata)</li><li>Implement citation-aware chunking, sectioning, and cross-referencing so RAG is precise, traceable, and cost-efficient</li><li>Build enrichment pipelines for tagging, classification, summarization, embeddings, entity extraction, and graph relationships, using AI where it helps</li><li>Improve search quality via better indexing strategies, analyzers, synonyms, ranking, and relevance evaluation</li><li>Establish data quality, lineage, and observability (QA checks, coverage metrics, regression tests, versioning)</li><li>Optimize performance, runtime complexity, DB query times, token usage, and overall pipeline cost</li><li>Collaborate closely with users and customers to translate their problems and our requirements into robust data pipelines and SLAs</li><li>Communicate your work and findings to the team for continuous feedback and improvement (in English)<br/><br/></li></ul><strong>What you bring<br/><br/></strong><strong>Minimum Qualifications<br/><br/></strong><ul><li>Degree in Computer Science, Data Science, or a related field, or equivalent practical experience</li><li>Strong hands-on experience in data engineering with TypeScript</li><li>Solid grasp of data structures, algorithms, regexes, and SQL (PostgreSQL)</li><li>Experience using LLMs/embeddings for practical data tasks (chunking, tagging, summarization, RAG-ready pipelines)</li><li>Ability to learn quickly and adapt to a dynamic startup
environment, with a strong sense of ownership and a product mindset</li><li>Available full-time, on-site in Zurich at least two days per week (hybrid)<br/><br/></li></ul><strong>Preferred Qualifications<br/><br/></strong><ul><li>You have a Swiss work permit or EU/EFTA citizenship</li><li>Working proficiency in German (much of our legal data is in German) and proficiency in English</li><li>Experience with Azure (incl. Azure AI/Cognitive Search), Docker, and CI/CD</li><li>Familiarity with modern scraping/parsing stacks (Playwright/Puppeteer, PDF tooling, OCR)</li><li>Experience with vector indexing, relevance evaluation, and search ranking</li><li>Familiarity with our stack: Azure / NestJS / Next.js</li><li>Knowledge of and experience with legal systems, in particular those of Switzerland, Germany, and the USA<br/><br/></li></ul><strong>Benefits<br/><br/></strong><ul><li>Direct impact: your pipelines immediately improve search, answers, and user trust, transforming legal research</li><li>Autonomy and ownership: end-to-end ownership across ingestion, processing, enrichment, and indexing</li><li>Team: professional growth at the intersection of legal, data, and AI with an interdisciplinary team</li><li>Compensation: CHF 8’000–12’000 per month + ESOP (employee stock options), depending on experience and skills<br/><br/></li></ul>We’re excited to hear from candidates who are passionate about data engineering and eager to make an impact in the legal tech space.<br/><br/> </div>