Junior LLM Engineer – Entry-Level NLP & AI Developer

Pay: Negotiable
Remote
Full-time

You’ll join an innovation-first culture where junior voices influence architecture decisions. Expect mentorship from senior researchers, regular paper-club sessions, and the freedom to prototype your boldest ideas, then see them ship to millions of users across fintech, retail, and ed-tech products. Remote-first collaboration tools keep the playing field level, whether you’re dialing in from a campus apartment or a coworking loft.


Key Responsibilities  

- Build, fine-tune, and evaluate transformer-based LLMs using Python and PyTorch.  

- Develop reusable NLP pipelines for tokenization, embedding, and prompt engineering.  

- Run automated experiments (hyperparameter sweeps, A/B tests, bias audits) to improve model accuracy.  

- Diagnose model drift, latency spikes, and hallucinations; craft fixes and regression tests.  

- Maintain clean, versioned datasets (JSON, Parquet, vector stores) through data validation scripts.  

- Document architectures, training logs, and experiment outcomes for peer review.  

- Partner with MLOps engineers to containerize models (Docker) and integrate them into CI/CD pipelines.  

- Support production rollouts, monitoring dashboards, and user feedback loops.  

- Contribute to design reviews, sprint planning, and code reviews in an Agile squad.  


Required Skills  

- Bachelor’s degree in Computer Science, Data Science, or related discipline.  

- Proficient in Python 3.x plus libraries such as PyTorch, TensorFlow, Hugging Face.  

- Solid grasp of NLP fundamentals—tokenization, attention, embeddings, beam search.  

- Familiarity with prompt engineering, reinforcement learning from human feedback (RLHF), and retrieval-augmented generation.  

- Experience with Git, Linux command line, and RESTful API consumption.  

- Analytical mindset; able to translate metrics like perplexity and BLEU into product insights.  

- Clear written and verbal communication; comfortable explaining complex ideas to non-technical peers.  

- Growth mindset—curious, coachable, and eager to explore new research papers and open-source tools.  


Preferred Extras (nice-to-have)  

- Coursework or projects using distributed training on GPUs (CUDA, NCCL).  

- Hands-on with vector databases (FAISS, Milvus) or cloud ML services (AWS SageMaker, GCP Vertex AI).  

- Participation in Kaggle competitions or open-source contributions related to NLP.  

- Knowledge of data privacy regulations (GDPR, CCPA) impacting AI pipelines.