
Name : Rahul M Holla

Job Role : SDE II

Experience : 2 Years 6 Months

Education : B. Engg, VTU

Address : Bengaluru, India

Technical Skills

SQL & Databases 85%
Python 95%
Data Engineering 90%
Data Visualization 85%
Machine Learning 80%
Cloud Services (AWS | GCP) 85%
Git & Docker 90%

Soft Skills

Communication
Team Collaboration
Adaptability
Time Management

About


With over 2.5 years of experience in data science and machine learning, backed by a bachelor's degree in engineering, I am proficient in data engineering, processing, visualization, and machine learning, with a demonstrated record of leading impactful projects and delivering effective solutions and efficient pipelines.

  • Profile : Data Science & Machine Learning
  • Education : Bachelor of Engineering, VTU
  • Databases : Oracle SQL, MySQL, PostgreSQL, SQLite, RDS, Redshift, DynamoDB, MongoDB, Hive, Milvus, BigQuery, Snowflake, Qdrant, Weaviate
  • Web Frameworks : Django, Flask, FastAPI, Streamlit, Gradio
  • Web Technologies : HTML, CSS, JavaScript, Bootstrap, Tailwind
  • Data Tools : dbt, Airflow, Databricks, NumPy, Kafka, Pandas, Spark, Prometheus, Grafana, Matplotlib, spaCy, Airbyte
  • Data Visualization : Power BI, Tableau, QuickSight, Plotly, Seaborn
  • Machine Learning : Fine-tuning, Supervised and Unsupervised Learning, RAG Implementation, Natural Language Processing, Deep Learning
  • Trained Models : BERT, Llama 2, Llama 3, Mistral, Falcon, Mixtral 8x7B, Gemma 7B, Claude 2, Claude 3
  • ML Frameworks : PyTorch, TensorFlow, Keras, Scikit-learn
  • ML Techniques : Linear Regression, Logistic Regression, NLTK, Random Forest, Naive Bayes, Clustering, EDA, Principal Component Analysis, RNN, CNN, KNN, Time Series Forecasting, Classification
  • Cloud Services : AWS, Google Cloud Platform, Hugging Face
  • Git and Services : GitHub, GitLab, Bitbucket, Docker, Jenkins
  • Code Editors : VS Code, Colab, Jupyter Notebook, PyCharm
  • Other Tools : Postman, Jira, Miro, DBeaver, REST API, Trello


Experience


Software Development Engineer with 2.5+ years of experience building data-processing pipelines and training Large Language Models (LLMs) for various use cases. Proven expertise in data science, machine learning algorithms, and project management.

Clients


Dehaze - LimitBreak
Data Engineer
2023 : 8 Months

Limit Break is a blockchain game company bringing the free-to-play gaming experience to Web3 and beyond. It was founded by international pioneers in the mobile gaming business.

  • Built a pipeline to analyse ad campaign, clickstream, and customer survey data, identifying growing demand and informing the launch of a strategic product line. Built a Druid dashboard for visualization and used Prometheus for metrics analysis.
  • Built an internal tool to categorize and automate NFT giveaways based on pre-defined templates with various weightages such as Bitcoin holdings, Ethereum stockpiles, Dune metrics, and participants' previous track records.
  • Designed and executed A/B tests and performed rigorous statistical analysis for collaborators such as Adjust and AppsFlyer based on latency, update intervals, and real-time data throughput, leading to a 13% MoM increase in conversion rate.

Dehaze - BlueOcean
Data Engineer
2022 - 2023 : 6 Months

BlueOcean is a market leader in AI-powered decision-making insights that go beyond traditional brand trackers and point solutions, covering everything from Share of Voice investments to messaging strategies to outmaneuvering the competition.

  • Categorized and sorted data and metrics from various sources, and automated previously manual data-analysis processes.
  • Built QuickSight dashboards, set up an AWS-stack pipeline, and used Tableau for visualization and reporting.
  • Improved data cleaning and preprocessing, and optimized data storage for faster query execution. Automated the sorting, classification, and labeling of relevant data on cron schedules.
  • Worked on a real-time streaming data-processing pipeline (approx. 230 ms latency), alongside Apache Airflow for efficient batch-job management and orchestration, to derive actionable insights from massive, complex datasets.

Dehaze - Fable
Data Engineer
2022 : 6 Months

Fable is an AI app for entertainment discovery and communities. It is on a mission to help build safe communities as cohorts of colleagues come together to discover, discuss, and organize books and shows as a means to learning and mental wellbeing.

  • Worked on data processing and storage from multiple sources. Converted the unstructured data into a structured and labelled format.
  • Built a chatbot powered by a fine-tuned Llama 2 generative AI model trained on custom data that suggested good reads based on users' likings, requirements, and previous preferences. The bot could also give a brief summary of a book or show, including a no-spoiler version to keep the user's excitement up. The model had an average accuracy of over 92%.
  • Set up a RAG implementation so the model could retrieve newly added data, such as new books and shows, from the database, mitigating any need for further fine-tuning on the new data.
  • Introduced Book Families and Book Series to facilitate deeper exploration of thematic connections, and refined genre categorization to enrich the catalog's search and recommendation capabilities. Ongoing maintenance processes kept the catalog current and relevant in a dynamic literary landscape.
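For a flavor of the retrieval step, here is a toy sketch. It is purely illustrative: the catalog entries are made up, and a bag-of-words cosine similarity stands in for the real sentence embeddings and vector database the production system would use.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real RAG setup would use a
    # sentence-embedding model and a vector store such as Milvus.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, catalog: list[str], k: int = 2) -> list[str]:
    # Rank catalog entries by similarity to the query, keep the top k.
    q = embed(query)
    ranked = sorted(catalog, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, catalog: list[str]) -> str:
    # Stuff the retrieved entries into the model prompt as context.
    context = "\n".join(retrieve(query, catalog))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical catalog, for demonstration only.
catalog = [
    "Dune: a science fiction saga about desert politics",
    "Pride and Prejudice: a classic romance novel",
    "Foundation: a science fiction series about a galactic empire",
]
```

Because retrieval happens at query time, adding a new title to `catalog` makes it immediately available to the model without retraining, which is the point of the RAG setup described above.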

Dehaze - Grin
Data Engineer
2022 : 4 Months

Grin is the world's first creator management platform, turning brands into household names. It is rated #1 across all top review sites, including Capterra, G2 Crowd, and Influencer Marketing Hub, and has collaborated with some of the world's fastest-growing brands, including SKIMS, Warby Parker, Allbirds, Mejuri, and MVMT.

  • Helped transition data-processing workflows from Xplenty to Apache Airflow. Airflow's modular and scalable architecture offered the flexibility to design, schedule, and monitor complex data pipelines efficiently. The migration was executed seamlessly, with no data lost during the transition.
  • Recognized an opportunity to optimize data querying and transformation, replacing BigQuery scheduled queries with dbt Cloud, a transformation tool that provided enhanced control and flexibility. This streamlined data transformations and significantly improved processing efficiency.
  • These changes reduced processing times and improved query performance: jobs that previously took days with Xplenty completed almost 7.5x faster with Airflow. They also enhanced scalability, allowing data-processing capacity to grow effortlessly with increasing data volumes and diverse sources, and the move to dbt made data transformations more reliable.



Education


Here's a quick look at my academic profile.

2013   -   2015

Pre-University Board

Sheshadripuram Composite College, KSEAB

Grade : First Class

2015   -   2019

Bachelor of Engineering

Sri Krishna Institute of Technology, VTU

Grade : First Class






Badges


Explore some of the badges I've recently achieved.

Projects


Take a look at a few of my recent projects.

Phishtrap : The anti-phishing tool

Phishing attacks are the most common form of cyber attack and remain a major risk even to well-established enterprises. To tackle this, we built an end-to-end pipeline to prevent phishing attacks in any workspace environment, scalable up to 500+ users per node. The pipeline continuously monitors and scans new mail, identifies malicious content such as web links or senders, and reports and blocks it directly from the server, all without storing any sensitive information.
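A highly simplified version of the scanning step might look like the sketch below. The blocklist, the suspicious-TLD heuristic, and the function names are hypothetical illustrations, not Phishtrap's actual rules.

```python
import re

# Hypothetical blocklist and heuristics -- illustrative only.
BLOCKED_SENDERS = {"scam@phish.example"}
SUSPICIOUS_TLDS = (".xyz", ".top")

URL_RE = re.compile(r"https?://\S+")

def scan_email(sender: str, body: str) -> list[str]:
    """Return reasons the mail looks malicious; an empty list means clean.

    Only the verdict is kept -- the message body itself is never stored,
    mirroring the pipeline's no-sensitive-data constraint.
    """
    reasons = []
    if sender.lower() in BLOCKED_SENDERS:
        reasons.append(f"blocked sender: {sender}")
    for url in URL_RE.findall(body):
        if url.rstrip("/").lower().endswith(SUSPICIOUS_TLDS):
            reasons.append(f"suspicious link: {url}")
    return reasons
```

In a real deployment this check would run per message on the mail server, with a non-empty result triggering the report-and-block path rather than delivery.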

S.T.A.R Project : Stock Tracker, Analysis and Results

A tool that lets you pick any stock from the Fortune 500 companies and provides real-time data such as the day's high, low, open, and close prices, while our fine-tuned Mistral 7B AI model gives you a smart recommendation on whether to buy, hold, or sell that stock. The model's output is 85%+ comparable to GPT-4 on accuracy, context, and conviction scores, at roughly 10% of GPT-4's cost, inclusive of hosting.

Break-Fix Bot : AWS Error Resolution with Intelligent Insights

Traditional AWS error-resolution processes are often time-consuming and manual. Break-Fix Bot addresses this by automating error-log retrieval across various AWS services, alerting via email and Slack messages on a desired channel as soon as an error occurs, and leveraging Anthropic's latest Claude AI model for intelligent solutions. It also encourages interactive collaboration, allowing users to engage in dynamic conversations and seek clarifications.


Sales Forecast : Time Series Forecasting

Used multiple machine learning models to forecast retail-store sales and performed time series analysis.
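The simplest baseline in this space is a naive moving-average forecaster, sketched below. The actual project used multiple ML models, so this stands in only as an illustration of the idea.

```python
def moving_average_forecast(series, window=3, horizon=2):
    """Forecast `horizon` future points by averaging the last `window`
    observations, feeding each prediction back into the history."""
    if len(series) < window:
        raise ValueError("series shorter than window")
    history = list(series)
    predictions = []
    for _ in range(horizon):
        prediction = sum(history[-window:]) / window
        predictions.append(prediction)
        history.append(prediction)
    return predictions
```

Any candidate model, including this baseline, is typically judged on a held-out tail of the series; a model that cannot beat the moving average is not worth deploying.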

Data Analysis : Insights into enhancing customer experience

Performed exploratory data analysis and visualization on sales data to improve the customer experience and boost sales.

Customer Segmentation : Analysis using clustering model

Developed a core ML model using clustering to recommend financial products and services to target customer groups.


Projects on Github

I love to solve data problems and build custom solutions that meet clients' needs and requirements.

Contact



Below are the details to reach out to me!


Have a question for me? Let's discuss.
