Technical Posts and Projects

Last updated: 12th July, 2024

Hello, I am Sankalp. I worked as a backend engineer at a US-based fintech company for the past 2 years.

Apart from general software development, I have a strong foundation in deep learning. These days, I am in the process of catching up with LLMs, both at the applied layer and at the foundational/architectural level.

I am currently open to applied ML/MLE positions, both full-time (FTE) and contract. If you are hiring, contact me on X or send a mail to hgirl3078@gmail.com for more details, as this page does not give the full picture.

Technical Posts

Projects

VimSonnet

VimSonnet was a small project where I tried to control my browser using Claude 3.5 Sonnet as an agent. GitHub

The approach in the demo uses Vimium (a Chrome extension that generates hint tags for all UI elements in a browser) along with Pyautogui and Claude 3.5 Sonnet as the LLM.

I had tried a screen-coordinate-based approach, but Claude's vision cannot point out coordinates by itself, so hint tags were required.
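As a rough illustration of the hint-tag idea (not the project's actual code), the sketch below turns a hint tag chosen by the model into key presses. It assumes Vimium's default `f` binding for showing link hints. The function names and the injected `press` callable are hypothetical; in a real run you would pass something like `pyautogui.press`.

```python
from typing import Callable, List


def hint_keys(hint: str) -> List[str]:
    """Return the key presses that activate a Vimium hint tag.

    Assumes Vimium's default binding: 'f' shows the hint tags, and typing
    the tag's letters activates the matching element.
    """
    if not hint.isalpha():
        raise ValueError(f"hint tags are expected to be letters, got {hint!r}")
    return ["f"] + list(hint.lower())


def click_hint(hint: str, press: Callable[[str], None]) -> None:
    """Send the key presses for a hint tag through an injected `press` function.

    `press` is a stand-in for a keyboard driver (hypothetical wiring);
    injecting it keeps this sketch runnable without a display.
    """
    for key in hint_keys(hint):
        press(key)
```

Because the model only has to name a short tag such as "SA" instead of pixel coordinates, the vision model never needs to localize anything on screen itself.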

all i wanted to do was post a tweet by claude and that has been done. the screen coordinate approach doesn't work unless your vision model supports coordinates (some can do, some getting cooked). this approach uses vimium hint tags + pyautogui + 3.5 sonnet function calls https://t.co/w32UBKBM60 pic.twitter.com/8tCYDMLEhT
— sankalp (@dejavucoder) June 30, 2024

CodeQA

CodeQA is a chat-with-a-codebase project. It lets users search a codebase and get relevant files and code snippets by asking questions in natural language (English). It supports Python, Rust, JavaScript and Java, and provides a minimal UI for easy interaction.

It uses tree-sitter to generate an AST and construct a codebase index before generating embeddings. It then does top-K RAG (Retrieval Augmented Generation) over these embeddings, along with further post-retrieval techniques. I wrote a couple of blog posts (listed in the section above) for CodeQA.
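To give a feel for AST-based indexing, here is a minimal sketch that splits a Python file into function and class chunks. It uses Python's built-in `ast` module as a stand-in for tree-sitter, so it only handles Python. The chunk fields and the `index_python_source` name are assumptions for illustration, not CodeQA's actual schema.

```python
import ast
from typing import Dict, List


def index_python_source(source: str, path: str) -> List[Dict[str, object]]:
    """Split Python source into function/class chunks suitable for embedding."""
    tree = ast.parse(source)
    chunks: List[Dict[str, object]] = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append(
                {
                    "path": path,
                    "name": node.name,
                    "kind": type(node).__name__,
                    "start_line": node.lineno,
                    # Exact source text of the definition, ready to embed.
                    "code": ast.get_source_segment(source, node),
                }
            )
    # ast.walk does not guarantee source order, so sort explicitly.
    chunks.sort(key=lambda chunk: chunk["start_line"])
    return chunks
```

Chunking along syntactic boundaries keeps each embedded unit a complete definition instead of an arbitrary window of lines.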


Demo:

a short demo of codeQA (sped up 2x)
i try chatting with @ vikhyatk's moondream repo.
i ask 'what is the repo about', tell about visionencoder models, show code, provide references (visionTransformer wrapped in EncoderWrapper module)
song: potsu - take me there pic.twitter.com/3nFapfy471
— sankalp (@dejavucoder) May 15, 2024

Twitter Circle

GitHub Repository

A tool to visualize your Twitter network and direct messaging history.

Make a Twitter Circle visualization for up to 200 users.
Check leaderboard based on combined weights of all your mentions of other users and all direct messages.
Check DM stats: message count per recipient, messages sent/received per user, total messages, and the last message with each user.
View a DM bar graph showing messages per month over 5 years of data.
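The leaderboard idea above can be sketched as a weighted sum of mention counts and DM counts. The weights, the `top_n` default and the function name below are assumptions for illustration; the project's actual weighting is not stated here.

```python
from collections import Counter
from typing import Dict, List, Tuple


def leaderboard(
    mentions: Dict[str, int],
    dm_counts: Dict[str, int],
    mention_weight: float = 1.0,  # hypothetical weight
    dm_weight: float = 0.5,  # hypothetical weight
    top_n: int = 200,
) -> List[Tuple[str, float]]:
    """Rank users by a combined score of mentions and direct messages."""
    scores: Counter = Counter()
    for user, count in mentions.items():
        scores[user] += mention_weight * count
    for user, count in dm_counts.items():
        scores[user] += dm_weight * count
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_n]
```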

This project has been used by 200+ people so far and has received 100+ stars.

https://t.co/gVQAaUelSC pic.twitter.com/EKRQYPkjDC
— sankalp (@dejavucoder) June 14, 2024

SemanTweet Search

GitHub Repository

SemanTweet Search allows you to search over all your tweets from the Twitter archive using semantic similarity. A demo is available here.

It preprocesses your tweets, generates embeddings using OpenAI's small or large embedding model, stores the data and embeddings in the LanceDB vector database, and provides a web interface to search and view the results.

You can also pre-filter by time, likes, retweets, media-only or link-only tweets before running the semantic search.

Pre-filtering with SQL operations not only filters the results but also reduces the vector search space, which speeds up the search.

You can additionally use or edit projector.py with the TensorFlow Projector to visualize your tweets using the t-SNE algorithm, as shown here.
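The effect of pre-filtering can be sketched in plain Python: apply the metadata filter first, then rank only the surviving rows by cosine similarity. This is an in-memory illustration and does not use LanceDB; the row layout (`text`, `likes`, `vector`) and the function names are assumptions.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def prefiltered_search(
    rows: List[Dict],
    query_vector: Sequence[float],
    keep: Callable[[Dict], bool],
    k: int = 5,
) -> List[Dict]:
    """Filter rows by metadata first, then return the top-k by similarity."""
    # Only rows passing the filter are scored, so the search space shrinks.
    candidates = [row for row in rows if keep(row)]
    candidates.sort(key=lambda row: cosine(row["vector"], query_vector), reverse=True)
    return candidates[:k]
```

For example, `keep=lambda row: row["likes"] >= 10` plays the role of a SQL `WHERE likes >= 10` clause applied before the vector comparison.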

Demo Thread | Visualization Demo

Handpicked by Haiku

GitHub Repository

An LLM-based app that generates personalized recommendations and visualizations based on user input, leveraging Claude Haiku's knowledge base.

CaptionBot

GitHub Repository

This is my PyTorch implementation of the paper "Show, Attend and Tell".

It generates descriptive captions for your images. The architecture is Seq2Seq-based image captioning with an attention mechanism: a ResNet-50 encoder extracts meaningful image features, and an LSTM-based decoder with soft attention and beam search generates the caption. It achieved a BLEU-1 score of 59.2 and a BLEU-4 score of 19.56, closely matching the original benchmarks.
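For readers unfamiliar with beam search, here is a minimal, model-agnostic sketch of the decoding loop. It is not the repository's code: the `step` callable is a hypothetical stand-in for the LSTM decoder and returns next-token probabilities for a partial caption. The sketch scores by summed log-probability with no length normalization.

```python
import math
from typing import Callable, Dict, List, Tuple

Beam = Tuple[List[str], float]  # (tokens so far, summed log-probability)


def beam_search(
    step: Callable[[List[str]], Dict[str, float]],
    start: str,
    end: str,
    beam_size: int = 3,
    max_len: int = 20,
) -> List[str]:
    """Return the highest-scoring token sequence found by beam search."""
    beams: List[Beam] = [([start], 0.0)]
    finished: List[Beam] = []
    for _ in range(max_len):
        candidates: List[Beam] = []
        for tokens, score in beams:
            for token, prob in step(tokens).items():
                if prob > 0.0:  # skip impossible tokens, log(0) is undefined
                    candidates.append((tokens + [token], score + math.log(prob)))
        # Keep only the best `beam_size` expansions at each step.
        candidates.sort(key=lambda beam: beam[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == end:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    # Fall back to unfinished beams if nothing reached the end token.
    finished.extend(beams)
    return max(finished, key=lambda beam: beam[1])[0]
```

With `beam_size=1` this reduces to greedy decoding; a wider beam can recover captions whose first word is not the single most likely one.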
