Research archive

Studies, artifacts, and prompt sweeps for people tuning models.

If the homepage helps you understand the knobs, the archive shows how those knobs behave on real tasks. This is where SampleLens publishes protocols, sweep pages, and practical notes you can borrow from.

1 live artifact with browser-side interaction

1 published benchmark protocol ready for real sweeps

4 upcoming study tracks already scoped

Live now

Start with what is already useful.

The site gives you two entry points today. Artifact 001 builds intuition for decoding. Protocol 001 shows how a real study is structured before its results page is published.

Artifact 001

Live

Sampling filter explorer

Interactive visualization for Temperature, Top-P, Min-P, and DRY. A fast way to understand the mechanics before moving into prompt-level comparisons.

Open artifact
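If you want the mechanics in code rather than pixels, here is a minimal NumPy sketch of one decoding step with the truncation filters applied. It is illustrative only, not the artifact's source, and DRY is omitted because it penalizes repetition against the generated context rather than reshaping a single step's distribution.

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def sample_step(logits, temperature=1.0, top_p=1.0, min_p=0.0, rng=None):
    """One decoding step: temperature, then Min-P and Top-P truncation, then sample."""
    rng = rng or np.random.default_rng()
    probs = softmax(np.asarray(logits, dtype=float) / temperature)

    keep = np.ones_like(probs, dtype=bool)

    # Min-P: drop tokens whose probability is below min_p times the top token's.
    keep &= probs >= min_p * probs.max()

    # Top-P (nucleus): keep the smallest high-probability set whose mass reaches top_p.
    order = np.argsort(probs)[::-1]
    cutoff = int(np.searchsorted(np.cumsum(probs[order]), top_p)) + 1
    nucleus = np.zeros_like(keep)
    nucleus[order[:cutoff]] = True
    keep &= nucleus

    filtered = np.where(keep, probs, 0.0)
    return rng.choice(len(probs), p=filtered / filtered.sum())

print(sample_step([2.0, 1.0, 0.2, -1.0], temperature=0.8, top_p=0.9, min_p=0.05))
```

Lower temperature sharpens the distribution before either filter runs, which is why the same Top-P value keeps fewer tokens at temperature 0.8 than at 1.3.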

Protocol 001

Published

Business ideas under invariants

A constrained ideation benchmark. It defines the task, invariants, sweep matrix, rubric, and publication plan before outputs are added.

Read protocol
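To make "sweep matrix" concrete before the outputs exist, here is a hypothetical sketch of how a full factorial grid gets enumerated into runnable cells. The parameter names and ranges are placeholders, not Protocol 001's published grid.

```python
from itertools import product

# Hypothetical grid; Protocol 001's actual ranges live on the protocol page.
GRID = {
    "temperature": [0.7, 1.0, 1.3],
    "top_p": [0.9, 1.0],
    "min_p": [0.0, 0.05],
}

def sweep_cells(grid):
    """Yield one settings dict per cell of the full factorial sweep."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

for cell in sweep_cells(GRID):
    print(cell)  # each cell is one run to generate, score, and publish
```

Twelve cells from three knobs; the protocol's job is to pin down everything else so that differences between cells mean something.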

Editorial rules

What belongs in the archive.

The archive only works if it stays narrow, reproducible, and bluntly honest about what it has and has not shown. The goal is to help people make better decisions, not flood the internet with more AI commentary.

Publish

Real workflows, not vague opinions

Constrained ideation, offer framing, extraction, critique, and ranking are recurring tasks. They should be the archive's units of study.

Show

The setup, not just the conclusion

Every useful note should expose the prompt family, model, runtime, parameter range, and rubric clearly enough for a reader to challenge it. A sketch of such a setup manifest follows these rules.

Refuse

Generic AI content theater

If a page cannot point to a fixed setup, visible artifact, or practical takeaway, it should not be in the archive at all.
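As a concrete instance of the Show rule, here is the manifest sketched above. Field names and values are illustrative placeholders, not a published SampleLens study; the point is that everything a reader needs to challenge a result sits in one place.

```python
# Illustrative run manifest; names and values are placeholders.
MANIFEST = {
    "prompt_family": "constrained-ideation/v1",
    "model": "example-8b-instruct",          # exact weights or tag used
    "runtime": "llama.cpp build b1234",      # runtime pinned to a build
    "parameter_range": {"temperature": (0.7, 1.3), "min_p": (0.0, 0.1)},
    "rubric": "specificity, constraint adherence, novelty (1-5 each)",
    "seed_policy": "fixed seeds, 5 generations per sweep cell",
}
```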

Coming next

What should follow once Protocol 001 has real outputs.

The next studies should keep the same posture: narrow question, explicit constraints, inspectable evidence, and a conclusion that helps someone tune a real workflow rather than admire a pretty graph.

Track 02

Offer framing and landing page copy

Study how different decoding settings change specificity, pressure, and trust when the goal is to write persuasive copy without drifting into hype.

Track 03

Structured extraction

Benchmark schema fidelity, verbosity, and failure modes for extraction tasks that need consistent shape more than stylistic range.

Track 04

Critique and ranking

Test when lower-variance settings help disqualify weak outputs cleanly and when some stochasticity improves coverage of hidden flaws.

Track 05

Runtime notes

Publish targeted notes about how Ollama, llama.cpp, and adjacent local runtimes expose or constrain the same controls differently.
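A taste of what those notes would cover: the sketch below sends the same three controls to a local Ollama server. The endpoint and option names match recent Ollama builds and the model tag is a placeholder; check your version, because llama.cpp exposes the equivalents as CLI flags (--temp, --top-p, --min-p), and that divergence is exactly what the notes would document.

```python
import json
from urllib import request

# Minimal request to a local Ollama server; assumes Ollama is running on the
# default port and that the named model has already been pulled.
payload = {
    "model": "llama3",  # placeholder model tag
    "prompt": "List three pricing experiments for a niche newsletter.",
    "stream": False,
    "options": {"temperature": 0.9, "top_p": 0.95, "min_p": 0.05},
}

req = request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```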