Artifact 001
Sampling filter explorer
Interactive visualization for Temperature, Top-P, Min-P, and DRY. A fast way to understand the mechanics before moving into prompt-level comparisons.
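The four filters the artifact visualizes can be sketched in a few lines. This is a minimal toy implementation, not the artifact's actual code: it applies temperature scaling, then Min-P, then Top-P to a list of logits, and omits DRY (a repetition penalty) for brevity.

```python
import math

def apply_filters(logits, temperature=1.0, top_p=1.0, min_p=0.0):
    """Toy sketch of three decoding filters over a logit list.

    Temperature rescales logits; Min-P drops tokens whose probability
    falls below min_p times the top token's probability; Top-P keeps the
    smallest set of tokens whose cumulative probability reaches top_p.
    (DRY, a repetition penalty, is omitted here for brevity.)
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Min-P: threshold relative to the most likely token.
    top = max(probs)
    keep = [p if p >= min_p * top else 0.0 for p in probs]

    # Top-P: keep tokens (highest first) until cumulative mass >= top_p.
    order = sorted(range(len(keep)), key=lambda i: keep[i], reverse=True)
    kept, cum = [0.0] * len(keep), 0.0
    for i in order:
        if cum >= top_p:
            break
        kept[i] = keep[i]
        cum += keep[i]

    # Renormalize the surviving mass back into a distribution.
    z = sum(kept)
    return [p / z for p in kept]
```

With logits `[2, 1, 0, -1]` and `top_p=0.9`, the least likely token is zeroed out and the rest are renormalized; raising `min_p` instead prunes relative to the top token.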
Open artifact
Research archive
If the homepage helps you understand the knobs, the archive shows how those knobs behave on real tasks. This is where SampleLens publishes protocols, sweep pages, and practical notes you can borrow from.
live artifact with browser-side interaction
published benchmark protocol ready for real sweeps
upcoming study tracks already scoped
Live now
The site already gives you two useful entry points. Artifact 001 helps you build intuition for decoding. Protocol 001 shows how a real study gets structured before the results page is published.
Artifact 001
The same interactive visualization of Temperature, Top-P, Min-P, and DRY featured above, useful for building decoding intuition before prompt-level comparisons.
Open artifact
Protocol 001
A constrained ideation benchmark. It defines the task, invariants, sweep matrix, rubric, and publication plan before outputs are added.
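A sweep matrix of the kind the protocol describes can be expanded mechanically into one run per parameter combination. The parameter names and ranges below are illustrative assumptions, not Protocol 001's actual values.

```python
from itertools import product

# Hypothetical sweep matrix; names and ranges are placeholders,
# not the published protocol's settings.
SWEEP = {
    "temperature": [0.4, 0.8, 1.2],
    "top_p": [0.9, 1.0],
    "min_p": [0.0, 0.05],
}

def sweep_runs(matrix):
    """Expand a sweep matrix into one settings dict per run."""
    keys = sorted(matrix)
    return [dict(zip(keys, vals)) for vals in product(*(matrix[k] for k in keys))]
```

The 3 × 2 × 2 matrix above expands to 12 runs, which is why protocols fix the matrix before generating outputs: the run count is knowable up front.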
Read protocol
Editorial rules
The archive only works if it stays narrow, reproducible, and bluntly honest about what it has and has not shown. The goal is to help people make better decisions, not flood the internet with more AI commentary.
Publish
Constrained ideation, offer framing, extraction, critique, and ranking are recurring tasks; these should be the archive's units of study.
Show
Every useful note should expose the prompt family, model, runtime, parameter range, and rubric clearly enough for a reader to challenge it.
Refuse
If a page cannot point to a fixed setup, visible artifact, or practical takeaway, it should not be in the archive at all.
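The "Show" rule above amounts to a small reproducibility schema. This is one possible sketch, with field names mirroring the rule's list; it is an assumption, not an established SampleLens format.

```python
from dataclasses import dataclass, asdict

# Illustrative record for the "Show" rule: field names follow the rule's
# own list (prompt family, model, runtime, parameter range, rubric).
@dataclass
class StudyNote:
    prompt_family: str
    model: str
    runtime: str
    parameter_range: dict
    rubric: str

    def is_publishable(self) -> bool:
        """A note qualifies only if every reproducibility field is filled in."""
        return all(bool(v) for v in asdict(self).values())
```

A note missing any field fails the check, which operationalizes the "Refuse" rule: no fixed setup, no page.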
Coming next
The next studies should keep the same posture: narrow question, explicit constraints, inspectable evidence, and a conclusion that helps someone tune a real workflow rather than admire a pretty graph.
Track 02
Study how different decoding settings change specificity, pressure, and trust when the goal is to write persuasive copy without drifting into hype.
Track 03
Benchmark schema fidelity, verbosity, and failure modes for extraction tasks that need consistent shape more than stylistic range.
Track 04
Test when lower-variance settings help disqualify weak outputs cleanly and when some stochasticity improves coverage of hidden flaws.
Track 05
Publish targeted notes about how Ollama, llama.cpp, and adjacent local runtimes expose or constrain the same controls differently.
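The Track 05 point, that different runtimes spell the same controls differently, can be made concrete with a small name-mapping table. The flag and option names below reflect common usage of llama.cpp and Ollama and may lag their current releases; treat them as assumptions to verify against each project's documentation.

```python
# How the same three decoding knobs commonly surface in two local runtimes.
# Spellings are assumptions based on widely used versions, not a spec.
KNOBS = {
    "temperature": {"llama.cpp": "--temp",  "ollama": "temperature"},
    "top_p":       {"llama.cpp": "--top-p", "ollama": "top_p"},
    "min_p":       {"llama.cpp": "--min-p", "ollama": "min_p"},
}

def spell_settings(runtime, settings):
    """Translate a generic settings dict into one runtime's spelling of each knob."""
    return {KNOBS[k][runtime]: v for k, v in settings.items() if k in KNOBS}
```

The same sweep definition can then drive either runtime, which is what makes cross-runtime notes comparable in the first place.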