Testing a LLMs ability to recall information from a long context.
I created a benchmark to test how well LLMs can recall information from a long context window. It was tested on gpt-4o & claude-2.1
Handcrafted bronze maps of American terrain.
In need of a physical project, I managed 3d printing, casting in bronze, and hand finishing maps of US mountains.
Visualizing different chunking strategies.
LLMs do better with shorter context windows. I built a tool to visualize how different chunking strategies to help you pick the one that is best for your use case.