Anantvir Singh – Medium

Anantvir Singh

Building a Next-Gen Customer Service Agent using Langgraph: LLMs, ReAct and Autonomous Problem…

Implementation can be found here.

Oct 9

Finetuning IBM Granite models using RLHF in Watsonx.ai

Proximal Policy Optimization for RLHF, implemented from scratch using Hugging Face TRL.

Sep 28

Finetuning IBM Granite models using RLHF in Watsonx.ai — Part 1 : Reward Modeling

What you’ll learn: 1. RLHF overview 2. How to implement reward modeling 3. How to fine-tune open-source models using RLHF on custom datasets

Sep 27

Anantvir Singh

Pre-sales Solutions Architect at IBM