How we evaluate, contextualize and publish.
Transparent. Six binding chapters carry every publication: from the source through the methodology to the correction.
- Chapters: 06
- Status: Binding
- Last updated: Apr 29, 2026
How we classify sources — and why.
A platform that evaluates AI lives on its sources. We make them visible, classify them by trust level and name their limits.
Every post that makes substantive claims about models, tools or vendors has a source. That source is linked and carries a trust level — from “primary source” to “handle with care”. Posts without a sufficient source base are not published.
We prefer primary sources (vendor releases, research papers, standards bodies) to established media, and established media to community content. Where a source has its own interests, as vendor marketing does, we flag that openly. We don't cite affiliate-driven comparison sites.
For external facts the rule is: source-based, dated, linked, with a confidence note for non-primary sources. We don't reproduce external claims as our own — we point to them transparently.
We do this
- Trust level (primary, established, community, caution, unknown) visible per claim.
- Date of the check visible (`lastCheckedAt`).
- Limitations of the source named.
We don't do this
- Citing a vendor press release as market analysis.
- Using community threads as proof without further evidence.
- Listing affiliate comparison sites as sources.
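The rules in this chapter can be pictured as a small data model. What follows is a minimal sketch, not the platform's actual schema: only the five trust levels and `lastCheckedAt` come from the text, while `SourceRef`, `source_problems`, and the fields `url`, `limitations` and `confidence_note` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List, Optional


class TrustLevel(Enum):
    PRIMARY = "primary"
    ESTABLISHED = "established"
    COMMUNITY = "community"
    CAUTION = "caution"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class SourceRef:
    # Hypothetical record; only the trust levels and the check date are named in the text.
    url: str
    trust_level: TrustLevel
    last_checked_at: Optional[date]  # corresponds to `lastCheckedAt`
    limitations: str = ""
    confidence_note: str = ""


def source_problems(src: SourceRef) -> List[str]:
    """Return the reasons a source reference is incomplete (empty list = complete)."""
    problems = []
    if not src.url:
        problems.append("source is not linked")
    if src.last_checked_at is None:
        problems.append("date of the check is missing")
    if src.trust_level is not TrustLevel.PRIMARY and not src.confidence_note:
        problems.append("non-primary source needs a confidence note")
    return problems


# Placeholder URLs, for illustration only.
vendor_release = SourceRef("https://example.com/release", TrustLevel.PRIMARY, date(2026, 4, 29))
forum_thread = SourceRef("https://example.com/thread", TrustLevel.COMMUNITY, date(2026, 4, 29))
```

In this sketch the vendor release passes as it stands, while the forum thread is flagged until someone adds a confidence note.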
How sources become content.
We don't copy texts and we don't copy images. We paraphrase, review editorially and publish only after an approved review.
Incoming sources are captured in a moderation queue. A first summary may be AI-assisted, but it is never published automatically. Every post goes through an editorial check against `CONTENT_POLICY.md` and only then receives the status `approved`.
Automated suggestions from the pipeline are given a confidence score. Posts with low confidence appear either with a clear notice or not at all. Hallucination risk is part of our per-post risk assessment.
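How a confidence score might map to those outcomes can be sketched as follows. The 0-to-1 scale, the cut-offs `0.5` and `0.8`, and the name `visibility_for` are illustrative assumptions; the text does not say how confidence is scored or where the thresholds lie.

```python
def visibility_for(confidence: float,
                   notice_below: float = 0.8,
                   hide_below: float = 0.5) -> str:
    """Map a pipeline confidence score (assumed 0.0-1.0) to a display decision.

    The thresholds are placeholders, not documented values.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence < hide_below:
        return "hidden"
    if confidence < notice_below:
        return "shown_with_notice"
    return "shown"
```

A decision of this kind only governs how a suggestion is displayed. It would not replace the editorial check that every post passes before approval.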
We don't take over images from other outlets. We use our own diagrams, our own data graphics or sources with clear licensing.
We do this
- Our own summary in our own words, original linked.
- Limitations and “when not to apply” as a mandatory block.
- Status chain: discovered → summarized → needs_review → approved → published.
We don't do this
- Taking over original texts from other outlets.
- Auto-publishing from pipelines — never.
- Generic AI stock images, robot icons or “hand reaching a globe out of an AI cloud”.
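The status chain reads like a small state machine. The sketch below assumes a strictly linear chain with no way back (the text does not say whether a post can be returned from `needs_review`), and it assumes that approval needs a named reviewer; `advance` and `reviewed_by` are hypothetical names.

```python
from enum import Enum


class Status(Enum):
    DISCOVERED = "discovered"
    SUMMARIZED = "summarized"
    NEEDS_REVIEW = "needs_review"
    APPROVED = "approved"
    PUBLISHED = "published"


# Each status may only move to the next one in the chain.
NEXT_STATUS = {
    Status.DISCOVERED: Status.SUMMARIZED,
    Status.SUMMARIZED: Status.NEEDS_REVIEW,
    Status.NEEDS_REVIEW: Status.APPROVED,
    Status.APPROVED: Status.PUBLISHED,
}


def advance(current: Status, reviewed_by: str = "") -> Status:
    """Move a post one step forward; approval requires a named human reviewer."""
    if current not in NEXT_STATUS:
        raise ValueError(f"{current.value} is the final status")
    target = NEXT_STATUS[current]
    if target is Status.APPROVED and not reviewed_by:
        # Assumption: this is how "never auto-publish" is enforced in the sketch.
        raise PermissionError("approval requires an editorial reviewer")
    return target
```

Because each status has exactly one successor, a pipeline cannot jump from `summarized` straight to `published`, and it cannot reach `approved` without passing a reviewer name.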
How tool recommendations come about.
We don't recommend a “best tool in the world”. A recommendation is always tied to a specific use case and a specific audience — with methodology, alternatives and limitations.
Every recommendation answers three questions: For which use case? For which audience? Under which constraints? We don't publish answers without these fields.
Each recommendation names at least one alternative — with a short reason why it didn't take first place. That makes it traceable what we weighed. Confidence notes show how certain we are.
Our own tools or skills that we recommend carry a clear disclosure note. Affiliate links do not appear, as a matter of principle.
We do this
- Use case + audience + limitations mandatory.
- At least one alternative with reasoning.
- Disclosure visible for our own tools/skills.
We don't do this
- “Top 10” lists without methodology.
- Blanket recommendations without a use case.
- Rankings driven by affiliate commissions.
Allowed
Currently recommended for German-language texts (use case) and for freelancers with an EU data residency requirement (audience). Reasoning: better instruction adherence than the alternatives compared, EU hosting available (as of April 2026). Alternatives: B (cheaper, weaker style), C (stronger for legal work).
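The mandatory fields of a recommendation lend themselves to a simple completeness check. This is a sketch under assumptions: `Recommendation`, `Alternative`, `missing_for_publication` and all field names are hypothetical, and "Tool A" is a placeholder, since the example above does not name the recommended tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alternative:
    name: str
    reason_not_first: str  # why it did not take first place


@dataclass
class Recommendation:
    tool: str
    use_case: str
    audience: str
    constraints: str
    alternatives: List[Alternative] = field(default_factory=list)
    is_own_tool: bool = False
    disclosure: str = ""


def missing_for_publication(rec: Recommendation) -> List[str]:
    """List what still blocks publication of a recommendation."""
    missing = [name for name in ("use_case", "audience", "constraints")
               if not getattr(rec, name).strip()]
    if not any(alt.reason_not_first.strip() for alt in rec.alternatives):
        missing.append("alternative with reasoning")
    if rec.is_own_tool and not rec.disclosure.strip():
        missing.append("disclosure")
    return missing


complete = Recommendation(
    tool="Tool A",  # placeholder name
    use_case="German-language texts",
    audience="freelancers with an EU data residency requirement",
    constraints="EU hosting required",
    alternatives=[Alternative("B", "cheaper, weaker style")],
)
blanket = Recommendation(tool="Tool A", use_case="", audience="", constraints="")
```

Here `complete` has nothing missing, while `blanket` fails on all three questions and on the missing alternative, which is the kind of blanket recommendation the chapter rules out.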
How our benchmarks are built.
Reproducible or not published. Every benchmark has a dataset, a script, hardware details, repetitions and a confidence score.
Benchmarks are our most important trust surface. If you can't reproduce a result, you're right not to believe it. That's why we only publish benchmarks whose methodology is documented up front and whose scripts live in the repo.
For each benchmark we document: task, dataset (source, license, version, size, bias notes), models and versions, hardware or API-only, metrics with definitions, repetitions, variance, cost, date of measurement. Claims without these fields are not benchmark claims.
Model versions change. Existing results are not overwritten; new values appear with a new date as an additional `BenchmarkResult` entry.
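One way to picture "not overwritten" is an append-only history. In the sketch below only the name `BenchmarkResult` comes from the text; its fields, the `BenchmarkHistory` class and the scores in the usage lines are assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)  # frozen: a stored result cannot be edited afterwards
class BenchmarkResult:
    model_version: str
    score: float
    measured_on: date


class BenchmarkHistory:
    """Append-only list of results: entries are added, never replaced."""

    def __init__(self) -> None:
        self._results: List[BenchmarkResult] = []

    def add(self, result: BenchmarkResult) -> None:
        self._results.append(result)

    def all(self) -> List[BenchmarkResult]:
        # Return a copy so callers cannot alter the stored history.
        return list(self._results)

    def latest(self) -> BenchmarkResult:
        # Raises ValueError if no result has been recorded yet.
        return max(self._results, key=lambda r: r.measured_on)


history = BenchmarkHistory()
history.add(BenchmarkResult("v1", 0.71, date(2026, 1, 15)))  # illustrative values
history.add(BenchmarkResult("v2", 0.78, date(2026, 4, 29)))  # illustrative values
```

After the second call the older `v1` measurement is still in the history, and `latest()` returns the `v2` entry by date.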
We do this
- Methodology publicly linked from every benchmark.
- Confidence note and limitations mandatory.
- Reproduction guide with script path and logs.
We don't do this
- Cherry-picking: showing only the best run.
- Adopting vendors' own benchmarks 1:1.
- Benchmarks without a limitations block.
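The alternative to cherry-picking is to summarize every repetition. A minimal sketch using Python's standard `statistics` module follows; the function name `summarize_runs`, the choice of sample standard deviation as the variance measure and the example scores are assumptions, not the documented metric definitions.

```python
import statistics
from typing import Dict, List


def summarize_runs(scores: List[float]) -> Dict[str, float]:
    """Summarize every repetition instead of reporting only the best run."""
    if len(scores) < 2:
        raise ValueError("need at least two repetitions to report variance")
    return {
        "repetitions": len(scores),
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),  # sample standard deviation
        "min": min(scores),
        "max": max(scores),
    }


summary = summarize_runs([0.70, 0.74, 0.72])  # illustrative scores
```

Reporting `mean` and `stdev` next to `min` and `max` keeps the best run visible without letting it stand in for the result.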
Mandatory fields per benchmark
Task · dataset (version, license) · models (version, mode) · hardware or API-only · metrics · repetitions · variance · cost in € · date · reproduction guide · limitations · confidence.
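The field list above can be checked mechanically before a benchmark is published. The sketch assumes a benchmark is held as a plain dictionary; the snake_case keys are a hypothetical rendering of the listed fields, and `missing_fields` is an illustrative name.

```python
from typing import Any, Dict, List

# Hypothetical snake_case rendering of the mandatory fields listed above.
MANDATORY_FIELDS = (
    "task", "dataset", "models", "hardware_or_api_only", "metrics",
    "repetitions", "variance", "cost_eur", "date",
    "reproduction_guide", "limitations", "confidence",
)


def missing_fields(benchmark: Dict[str, Any]) -> List[str]:
    """Return mandatory fields that are absent or empty, in list order."""
    return [name for name in MANDATORY_FIELDS
            if benchmark.get(name) in (None, "", [], {})]
```

A numeric zero, for example a cost of 0 €, counts as filled in; only absent or empty values are reported.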
How we correct mistakes.
We make mistakes. We correct them visibly, dated and with a reason — not quietly.
When we find a factual error or one is reported to us, we correct it. The correction appears as a visible entry with date and reason in the corrections log at `/legal/korrekturen`. The original entry remains traceable.
For substantial corrections (a false statement of fact, a wrong source, a misleading recommendation) we mark the affected post with a clear notice at the top. For cosmetic corrections (typos, grammar) an entry in the log is enough.
Vendors who dispute a claim get a statement block. We check, correct if necessary and document the outcome publicly.
We do this
- Corrections with date and reason.
- Corrections log publicly visible.
- Including statements from vendors.
We don't do this
- Silent changes to the original post.
- Hiding corrections or publishing them late.
- Ignoring vendor complaints.
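The split between substantial and cosmetic corrections can be sketched as a log entry plus one rule. `Correction`, `Severity`, `needs_notice_on_post` and the field names are hypothetical; the text only specifies that each entry carries a date and a reason and that substantial corrections also mark the post.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Severity(Enum):
    SUBSTANTIAL = "substantial"  # false fact, wrong source, misleading recommendation
    COSMETIC = "cosmetic"        # typos, grammar


@dataclass(frozen=True)
class Correction:
    post_id: str
    corrected_on: date
    reason: str
    severity: Severity

    def __post_init__(self) -> None:
        # A correction without a reason would amount to a silent change.
        if not self.reason.strip():
            raise ValueError("a correction needs a reason")


def needs_notice_on_post(correction: Correction) -> bool:
    """Every correction is logged; only substantial ones also mark the post."""
    return correction.severity is Severity.SUBSTANTIAL
```

Rejecting an empty reason at construction time means an entry cannot reach the log without explaining what changed.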
How we disclose our own interests.
Anyone promising methodology must disclose their own relationships. We do that proactively — even where we wouldn't have to.
Affiliate links do not appear. If we ever moved away from that, the change would have to be visibly marked on every affected page and on the privacy/transparency page. As long as our methodology holds, we have no interest in affiliate revenue.
Our own skills, workflows or tools that we recommend or sell on the platform are clearly labeled as in-house editorial. They can appear in recommendation lists, but only with visible disclosure and the same methodology as third parties.
Sponsored content is clearly separated from the editorial area, is always marked as such and can never influence editorial decisions. Advisory board or consulting relationships are made public at `/legal/transparenz`.
We do this
- Marking our own skills/workflows as “in-house editorial”.
- Sponsored content visibly set apart.
- Advisory/consulting relationships public.
We don't do this
- Native advertorials that look like editorial.
- Our own skills in recommendation slots without disclosure.
- Concealed relationships with vendors.