Sergii G.

Machine Learning CV/NLP/LLM Engineer

    Enthusiastic and analytical Machine Learning Engineer with a strong foundation in data analysis, machine learning, natural language processing, and computer vision. Completed a year of academic training in Data Science, with over 3 years of software development experience and a solid background in Python. Skilled in developing and evaluating LLMs, RAG systems, and computer vision pipelines using PyTorch, OpenCV, and deep learning and classical ML/DS tools (DNN, CNN, RNN, XGBoost, Matplotlib, etc.). Experienced in building end-to-end machine learning solutions, including multimodal models, customer churn prediction, and LLM-powered applications. Conducted research on image matching techniques (template- and keypoint-based) with a focus on accuracy, robustness, and optimization. Strong understanding of algorithm design, low-level performance, and integration of open-source libraries. Passionate about applying data-driven solutions to innovative projects that drive business growth.

    PROJECTS PORTFOLIO

    2025

    Legal Document Analyzer

    Main Goal: Develop an LLM-powered assistant that can summarize, analyze, and extract key insights from legal documents.

    Role: ML Engineer.

    Tasks:

    • Implement RAG (Retrieval-Augmented Generation) for relevant document retrieval.
    • Optimize prompts for legal question-answering.
    • Integrate the system with a chatbot or web UI for user interaction.

    Technologies: Python, LangChain, OpenAI/Claude, Pinecone/FAISS, FastAPI, Streamlit, Celery, Docker.

    2025

    Adaptive AI Tutor via Dify

    Role: ML Engineer

    • Created an interactive AI tutor app using the Dify framework, powered by a fine-tuned LLaMA‑3.1 model for personalized education in math and science.
    • Configured Dify's RAG modules with FAISS to connect to a structured knowledge base, ensuring answers are grounded in curriculum-aligned material.
    • Used Dify’s frontend builder and memory system to track student progress, learning patterns, and recurring questions, enabling personalized follow-up.
    • Implemented dynamic prompts and function calling via Dify tools to adapt question difficulty and simulate step-by-step human-like explanations.
    • Launched a self-service educational chatbot via Dify’s hosted UI and API interface, accessible both via chat and embeddable widget in learning platforms.

    Technologies: Python, Hugging Face Transformers (LLaMA‑3.1), Dify (App, RAG, Memory, Prompt Management), LangChain, FAISS, NumPy, Pandas, Streamlit, Docker

    2025

    Customer Support Assistant using Dify

    Role: ML Engineer

    • Developed an AI-powered support assistant using the Dify framework to automatically classify, summarize, and route incoming support tickets.
    • Fine-tuned a BERT-based model for intent classification and priority labeling, then integrated it into a Dify app for seamless user-facing chatbot interaction.
    • Leveraged Dify’s retrieval-augmented generation (RAG) pipeline with SentenceTransformers + Elasticsearch to surface relevant past tickets and their resolutions.
    • Used Dify’s workflow tools to integrate agent feedback into the retraining loop, enabling continuous model improvement and adaptive system behavior.
    • Deployed the complete pipeline using Dify's FastAPI backend and Docker, making it easy to scale and integrate with internal CRM tools.

    Technologies: Python, Hugging Face Transformers (BERT), Dify (App, Dataset, Feedback), SentenceTransformers, Elasticsearch, OpenAI GPT-4o, FastAPI, Docker, Pandas

    2025

    Forest Cover Type Prediction Using Ensemble Models

    Description: Developed a machine learning pipeline to predict forest cover types based on cartographic variables using ensemble models. 

    Tasks:

    • Performed exploratory data analysis to understand feature distributions and correlations, identifying patterns and potential data quality issues.
    • Engineered and selected relevant features, using techniques like PCA and statistical methods to reduce dimensionality and improve model efficiency.
    • Trained and evaluated ensemble models (Random Forest, XGBoost) with optimized hyperparameters, using cross-validation to ensure robust performance.
    • Implemented overfitting mitigation strategies, including regularization and pruning, to enhance model generalization on unseen data.

    Technologies: Python, NumPy, XGBoost, Matplotlib, Seaborn.

    2025

    AI Creative Writing Assistant

    Role: ML Engineer

    • Built a creative writing assistant that helps authors brainstorm and refine stories using an LLM fine-tuned for long-form narrative consistency and style.
    • Added a memory module to track characters, plot points, and tone guidelines, ensuring coherence and continuity across chapters of a novel or screenplay.
    • Enabled the assistant to suggest plot twists or character developments through controlled text generation techniques and user-provided cues.
    • Developed an interactive web interface where writers can iteratively edit the AI’s suggestions, preserving their own voice while benefiting from AI-driven creativity.

    Technologies: Python, Hugging Face Transformers (LLaMA-3.1 fine-tuned), PyTorch, LangChain (memory management), FAISS, NumPy, Gradio, Docker

    2024

    DL multimodal model

    Trained a multimodal model to predict the adoption rate of animals from shelters. The model uses photos of pets together with descriptions from tabular data.

    Tech stack: Python, Jupyter Notebook, NumPy, pandas, scikit-learn, cv2, Matplotlib, PyTorch, efficientnet_b3, RobertaModel

    2024

    ML model

    Trained a model for customer churn prediction.

    Tech stack: Python, Jupyter Notebook, NumPy, pandas, scikit-learn

    2024

    Personalized Learning Tutor

    Main Goal: Create an AI-powered tutor that adapts to a student’s learning style, explains concepts, and provides quizzes.

    Role: ML Engineer.

    Tasks:

    • Design a conversational AI model for personalized learning.
    • Implement a knowledge retrieval system using vector databases.
    • Create dynamic quizzes and feedback loops based on student responses.
    • Connect the system to educational resources via APIs.

    Technologies: Python, LangChain, OpenAI, FAISS, Streamlit, FastAPI, Hugging Face Transformers.

    2024

    Predicting Fraudulent Transactions Using Logistic Regression

    Description: Built a binary classification model to detect fraudulent financial transactions using logistic regression, supported by robust statistical evaluation and cross-validation techniques.

    Tasks:

    • Developed and trained a logistic regression model to identify fraudulent activity.
    • Applied cross-validation to validate model performance and prevent overfitting.
    • Selected appropriate evaluation metrics (e.g., precision, recall, F1-score) for imbalanced data.
    • Conducted statistical testing to assess the significance of model features and predictions.

    Technologies: Python, scikit-learn, Pandas, NumPy, Matplotlib, Seaborn.

    2024

    House Price Prediction

    Description: Develop a regression model to predict house prices based on various factors such as location, size, and amenities.

    Role in the project: ML Engineer.

    Responsibilities: Data collection, exploratory data analysis (EDA), feature selection, model training, and deployment.

    Technical summary: Python, scikit-learn, pandas, XGBoost, Flask/FastAPI for deployment.

    2024

    Credit Risk Assessment

    Description: Build a classification model to predict loan default risk using financial and demographic data.

    Role in the project: Data Scientist.

    Responsibilities: Data cleaning, feature engineering, imbalanced data handling, model selection, and explainability.

    Technical summary: Python, scikit-learn, LightGBM, SHAP for interpretability, SQL for data extraction.

    2023

    Sonic App

    Web-based chatbot that lets users upload PDF documents and answers questions based on the content of the uploaded document.

    Tech stack: Python, Django, LLM, Pandas, Docker, Alembic, GitHub

    2023

    Customer Churn Prediction

    Description: Predict whether a customer is likely to stop using a product or service based on historical data.

    Role in the project: Data Scientist.

    Responsibilities: Data preprocessing, feature engineering, model training, and evaluation.

    Technologies: Python, pandas, scikit-learn, XGBoost, TensorFlow/PyTorch, Streamlit for visualization.

    2023

    Recommender System for E-commerce

    Description: Develop a recommendation system for an e-commerce platform to suggest relevant products to users.

    Role in the project: Data Scientist.

    Responsibilities: Data preprocessing, collaborative filtering/content-based recommendations, model optimization, evaluation.

    Technical summary: Python, pandas, scikit-learn, Surprise library, TensorFlow, FastAPI/Flask for API deployment.

    2022

    Fraud Detection in Transactions

    Description: Identify fraudulent transactions in financial datasets using machine learning techniques.

    Role in the project: ML Engineer.

    Responsibilities: Data preprocessing, anomaly detection, building supervised/unsupervised models, real-time deployment.

    Technical summary: Python, scikit-learn, TensorFlow/PyTorch, Isolation Forest, Autoencoders, Kafka for streaming.

    2022

    AI-Powered Styled Picture Generation

    Main Goal: Create a system that generates stylized images of people, pets, or objects in various artistic styles (anime, steampunk, cyberpunk, etc.).

    Role: ML Engineer.

    Tasks:

    • Fine-tune a Stable Diffusion model using DreamBooth for custom image generation.
    • Implement a UI where users can upload images and select styles.
    • Optimize inference for faster generation using model quantization or LoRA.
    • Deploy as a web app with real-time image processing.

    Technologies: Python, Stable Diffusion, DreamBooth, OpenCV, Streamlit, FastAPI, Hugging Face Diffusers, Docker.

    2022

    Real-Time Car Detection and Classification

    Main Goal: Develop a Computer Vision system that detects cars in real time and classifies their make, model, and color.

    Role: ML Engineer.

    Tasks:

    • Collect and preprocess car images for training.
    • Train an object detection model (YOLOv8, Faster R-CNN) for car detection.
    • Implement a classification model for make/model/color recognition.
    • Deploy the system as an API for integration with traffic monitoring or security cameras.

    Technologies: Python, PyTorch/TensorFlow, OpenCV, YOLOv8/Faster R-CNN, FastAPI, Docker.

    2022

    3D Sphere Move

    This project was a test task. A sphere moves through a 3D surface along a specific path and deletes each point it touches. The task was to find these points and save the first layer of the surface.

    Skills gained: basics of OOP, linear algebra

    WORK EXPERIENCE

    Python Teacher

    GoITeens

    Taught Python to teenagers.

    Cooperated with the team. Students wrote and successfully submitted Telegram bot and Flask app projects.

    Sales Manager

    SAV ORBICO

    Communication with clients
    Non-food FMCG sales
    Sales analytics
    Cooperation with the team and other departments
    Contributed to a 10% increase in the turnover of the company's branch

    Skills

    Programming & Data Languages:
    • Python

    • SQL

    • NoSQL
    Databases:
    • Pinecone/FAISS, ChromaDB
    • PostgreSQL
    Frameworks & Libraries:
    • Flask
    • Pandas
    • NumPy
    • Matplotlib
    • PyTorch
    • scikit-learn
    Tools & Technologies:
    • Jupyter Notebook
    • Git
    • Jira

    • Open-source library integration

    • Performance optimization

    • Real-time system development

    Professional & Soft Skills
    • Analytical Thinking

    • Effective Communication

    • Multitasking

    • Problem solving
    • Research & experimentation

    Languages

    • English - Upper-Intermediate (B2)

    Education

    2023

    Master's degree in Computer Science: Data Science

    Woolf GoIT Neoversity

    2022-2023

    Python developer, Data Science

    IT School GoIT

    2019-2023

    Bachelor of Economics

    Donetsk National University of Economics and Trade