Software Engineer — ML Infrastructure

Full-time  •  IT & Software  •  Vienna, Austria

<div class="show-more-less-html__markup show-more-less-html__markup--clamp-after-5 relative overflow-hidden"> <p><strong>Ora Computing · Vienna · Full-time</strong></p><p><br/></p><p><span>We compress large language models (LLMs). Our information-theoretic structural pruning and quantization algorithm shrinks model footprints by over 80% without retraining, in hours rather than weeks.</span></p><p><span>We closed a €3.5M seed in 2026 (Constructor, Greencode Ventures, XISTA) and are working with customers in automotive, edge inference, and cloud.</span></p><p><br/></p><p><strong>The role</strong></p><p><br/></p><p><span>You'll own how our software stack is built. Today the codebase reflects four people moving fast. It works, but it needs structure. Your job is to give it that structure: well-designed libraries, robust packages and environments, the kind of codebase that scales as we grow the team and ship more to customers.</span></p><p><span>This is not a glue-code role. You'll work between the algorithm and the inference layer, designing a compression pipeline that is fully automated and takes target runtimes into account. 
You'll design the abstractions our compression pipeline runs on and make them fast.</span></p><p><br/></p><p><strong>What you'll work on</strong></p><p><br/></p><ul><li><span>Designing and refactoring our core libraries — pruning, quantization, retraining, evaluation — into clean, well-scoped packages</span></li><li><span>Building the internal tooling that lets the team move quickly without breaking things — CI, benchmarks, reproducible runs</span></li><li><span>Integrating our compression output with inference engines (vLLM, TensorRT-LLM, llama.cpp) and customer deployment targets</span></li><li><span>Setting the engineering bar for the team as we hire</span></li></ul><p><br/></p><p><strong>What we're looking for</strong></p><p><br/></p><ul><li><span>Bachelor’s/Master's in computer science or equivalent, plus 2+ years of professional software engineering</span></li><li><span>Strong opinions about code design. You know what a well-structured library looks like and why</span></li><li><span>GPU experience — memory hierarchy, kernels, what bottlenecks performance — even if you don't write CUDA daily</span></li><li><span>Production-grade Python. You write code others can read, extend, and trust</span></li><li><span>You finish things and you care about the codebase you leave behind</span></li></ul><p><br/></p><p><strong>Bonus</strong></p><p><br/></p><ul><li><span>Open-source contributions to ML infrastructure (vLLM, llama.cpp, transformers, TensorRT-LLM, PyTorch internals)</span></li><li><span>CUDA, Triton, or kernel-level work</span></li><li><span>Experience designing a library from scratch that other engineers ended up using</span></li><li><span>Familiarity with model serving and inference optimization</span></li></ul><p><br/></p><p><strong>Practical</strong></p><p><br/></p><ul><li><span>Vienna-based. 
Hybrid or fully remote</span></li><li><span>Working language is English</span></li><li><span>We sponsor visas and support relocation</span></li><li><span>Compensation: €70–120k base + equity. Austrian minimum disclosed per Kollektivvertrag: €45,738/year</span></li><li><span>You'll set the engineering standards we hire against next</span></li></ul><p><br/></p><p><strong>How to apply</strong></p><p><br/></p><p><span>Send CV, a code sample you're proud of, and any open-source links to info@oracomputing.com. Tell us in two paragraphs what you'd want to build at Ora and why. We respond within a week.</span></p><p><br/></p> </div>

Job Overview
  • Publication date

    Jun 09, 2026

  • Category

    IT & Software

  • Job Type

    Full-time

  • Location

    Vienna, Austria

  • Employer

    Ora Computing

  • Source

    LinkedIn