Junior LLM Engineer – Entry-Level NLP & AI Developer
You’ll join an innovation-first culture where junior voices influence architecture decisions. Expect mentorship from senior researchers, regular paper-club sessions, and the freedom to prototype your boldest ideas—then see them ship to millions of users across fintech, retail, and ed-tech products. Remote-first collaboration tools keep the playing field level, whether you’re dialing in from a campus apartment or a coworking loft.
Key Responsibilities
- Build, fine-tune, and evaluate transformer-based LLMs using Python and PyTorch.
- Code reusable NLP pipelines for tokenization, embedding, and prompt engineering.
- Run automated experiments (hyperparameter sweeps, A/B tests, bias audits) to improve model quality.
- Diagnose model drift, latency spikes, and hallucinations; craft fixes and regression tests.
- Maintain clean, versioned datasets (JSON, Parquet, vector stores) through data validation scripts.
- Document architectures, training logs, and experiment outcomes for peer review.
- Partner with MLOps engineers to containerize models with Docker and integrate them into CI/CD workflows.
- Support production rollouts, monitoring dashboards, and user feedback loops.
- Contribute to design reviews, sprint planning, and code reviews in an Agile squad.
Required Skills
- Bachelor’s degree in Computer Science, Data Science, or a related discipline.
- Proficient in Python 3.x and libraries such as PyTorch, TensorFlow, and Hugging Face Transformers.
- Solid grasp of NLP fundamentals—tokenization, attention, embeddings, beam search.
- Familiarity with prompt engineering, reinforcement learning from human feedback (RLHF), and retrieval-augmented generation.
- Experience with Git, Linux command line, and RESTful API consumption.
- Analytical mindset; able to translate metrics like perplexity and BLEU into product insights.
- Clear written and verbal communication; comfortable explaining complex ideas to non-technical peers.
- Growth-oriented attitude: curious, coachable, and eager to explore new papers and open-source projects.
Preferred Extras (nice-to-have)
- Coursework or projects using distributed training on GPUs (CUDA, NCCL).
- Hands-on with vector databases (FAISS, Milvus) or cloud ML services (AWS SageMaker, GCP Vertex AI).
- Participation in Kaggle competitions or open-source contributions related to NLP.
- Knowledge of data privacy regulations (GDPR, CCPA) impacting AI pipelines.