Methodology · Publishing system

How we evaluate, contextualize and publish.

Six binding chapters carry every publication, transparently, from the source through the methodology to the correction.

Chapters: 06
Status: Binding
Last updated: Apr 29, 2026

Sources

How we classify sources — and why.

A platform that evaluates AI lives on its sources. We make them visible, classify them by trust level and name their limits.

Every post that makes substantive claims about models, tools or vendors has a source. That source is linked and carries a trust level — from “primary source” to “handle with care”. Posts without a sufficient source base are not published.

We prefer primary sources (vendor releases, research papers, standards bodies) over established media, and established media over community content. Where a source has its own interests (vendor marketing, for instance), we flag that openly. We don't cite affiliate-driven comparison sites.

For external facts the rule is: source-based, dated, linked, with a confidence note for non-primary sources. We don't reproduce external claims as our own — we point to them transparently.

We do this

Trust level (primary, established, community, caution, unknown) visible per claim.
Date of the check visible (`lastCheckedAt`).
Limitations of the source named.
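
The fields listed above could be modelled roughly as follows. This is a minimal sketch, not the platform's actual schema: `lastCheckedAt` and the five trust levels come from the text, while `SourceRef`, `confidence_note` and `source_problems` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class TrustLevel(Enum):
    # The five levels named in the list above.
    PRIMARY = "primary"
    ESTABLISHED = "established"
    COMMUNITY = "community"
    CAUTION = "caution"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class SourceRef:
    url: str                    # link to the original
    trust_level: TrustLevel
    lastCheckedAt: date         # date of the check, named as in the text
    limitations: str            # named limits of the source
    confidence_note: str = ""   # hypothetical field for the confidence note


def source_problems(src: SourceRef) -> list:
    """Return the reasons a source reference is incomplete (empty list = fine)."""
    problems = []
    if not src.url:
        problems.append("missing link")
    if not src.limitations.strip():
        problems.append("limitations not named")
    # Non-primary sources need a confidence note.
    if src.trust_level is not TrustLevel.PRIMARY and not src.confidence_note.strip():
        problems.append("confidence note required for non-primary source")
    return problems
```

A claim whose source returns a non-empty list here would stay out of the published post until the gaps are filled.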

We don't do this

Citing a vendor press release as market analysis.
Using community threads without further evidence as proof.
Listing affiliate comparison sites as sources.

Content

How sources become content.

We don't copy texts and we don't copy images. We paraphrase, review editorially and publish only after an approved review.

Incoming sources are captured in a moderation queue. A first summary may be AI-assisted, but it is never published automatically. Every post goes through an editorial check against `CONTENT_POLICY.md` and only then receives the status `approved`.

Automated suggestions from the pipeline are given a confidence score. Posts with low confidence appear either with a clear notice or not at all. Hallucination risk is part of our per-post risk assessment.
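
The confidence rule above can be illustrated with a small gate. This is a sketch under assumptions: the policy does not state numeric thresholds, so `notice_below` and `hide_below` are hypothetical values, and `publication_mode` is an illustrative name. The gate only decides how an already reviewed post may appear; it does not replace the editorial check.

```python
def publication_mode(confidence: float,
                     notice_below: float = 0.8,
                     hide_below: float = 0.5) -> str:
    """Map a pipeline confidence score in [0, 1] to how a post may appear.

    Both thresholds are illustrative assumptions, not values from the policy.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < hide_below:
        return "hidden"        # low confidence: not shown at all
    if confidence < notice_below:
        return "with_notice"   # shown, but with a clear notice
    return "normal"
```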

We don't take over images from other outlets. We use our own diagrams, our own data graphics or sources with clear licensing.

We do this

Our own summary in our own words, original linked.
Limitations and “when not to apply” as a mandatory block.
Status chain: discovered → summarized → needs_review → approved → published.
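
The status chain above behaves like a strictly linear state machine. The following sketch shows one way to enforce it; the five status names come from the text, while `advance` and the `reviewer` parameter are hypothetical and stand in for whatever the real moderation queue uses.

```python
from typing import Optional

STATUS_CHAIN = ("discovered", "summarized", "needs_review", "approved", "published")


def advance(current: str, target: str, reviewer: Optional[str] = None) -> str:
    """Move a post exactly one step along the chain, or raise ValueError."""
    if current not in STATUS_CHAIN or target not in STATUS_CHAIN:
        raise ValueError(f"unknown status: {current!r} -> {target!r}")
    if STATUS_CHAIN.index(target) != STATUS_CHAIN.index(current) + 1:
        raise ValueError(f"cannot skip or go back: {current} -> {target}")
    # Approval is an editorial decision, so it needs a named human reviewer.
    if target == "approved" and not reviewer:
        raise ValueError("approval requires a reviewer")
    return target
```

Because every transition moves exactly one step, a pipeline cannot jump from `summarized` straight to `published`, which is the property the "no auto-publishing" rule relies on.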

We don't do this

Taking over original texts from other outlets.
Auto-publishing from pipelines — never.
Generic AI stock images, robot icons or “hand reaching a globe out of an AI cloud”.

Recommendations

How tool recommendations come about.

We don't recommend a “best tool in the world”. A recommendation is always tied to a specific use case and a specific audience — with methodology, alternatives and limitations.

Every recommendation answers three questions: For which use case? For which audience? Under which constraints? We don't publish a recommendation that leaves any of them unanswered.

Each recommendation names at least one alternative, with a short reason why it didn't take first place. That makes what we weighed traceable. Confidence notes show how certain we are.

Our own tools or skills that we recommend carry a clear disclosure note. Affiliate links do not appear, as a matter of principle.

We do this

Use case + audience + limitations mandatory.
At least one alternative with reasoning.
Disclosure visible for our own tools/skills.
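
A pre-publication check for these mandatory parts might look like the sketch below. The class and field names (`Recommendation`, `Alternative`, `in_house`, `disclosure_note`, `missing_fields`) are assumptions made for illustration; only the rules themselves are taken from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Alternative:
    name: str
    reason_not_first: str   # short reason why it did not take first place


@dataclass
class Recommendation:
    tool: str
    use_case: str
    audience: str
    constraints: str
    limitations: str
    alternatives: list = field(default_factory=list)
    in_house: bool = False      # True for our own tools or skills
    disclosure_note: str = ""


def missing_fields(rec: Recommendation) -> list:
    """Names of mandatory parts that are still empty (empty list = publishable)."""
    missing = [name for name in ("use_case", "audience", "constraints", "limitations")
               if not getattr(rec, name).strip()]
    # At least one alternative must carry a reason.
    if not any(alt.reason_not_first.strip() for alt in rec.alternatives):
        missing.append("alternative_with_reasoning")
    # Our own tools need a visible disclosure.
    if rec.in_house and not rec.disclosure_note.strip():
        missing.append("disclosure_note")
    return missing
```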

We don't do this

“Top 10” lists without methodology.
Blanket recommendations without a use case.
Rankings driven by affiliate commissions.

Allowed

Currently recommended for German-language texts (use case) and freelancers with an EU data residency requirement (audience). Reasoning: comparatively better instruction adherence, EU hosting available (as of April 2026). Alternatives: B (cheaper, weaker style), C (stronger for legal work).

Benchmarks

How our benchmarks are built.

Reproducible or not published. Every benchmark has a dataset, a script, hardware details, repetitions and a confidence score.

Benchmarks are our most important trust surface. If you can't reproduce a result, you're right not to believe it. That's why we only publish benchmarks whose methodology is documented up front and whose scripts live in the repo.

For each benchmark we document: task, dataset (source, license, version, size, bias notes), models and versions, hardware or API-only, metrics with definitions, repetitions, variance, cost, date of measurement. Claims without these fields are not benchmark claims.

Model versions change. Existing results are not overwritten: new values appear with a new date as an additional `BenchmarkResult` entry.
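
The "never overwrite" rule amounts to an append-only history. A minimal sketch, assuming a simple in-memory log: the name `BenchmarkResult` comes from the text, but its fields and the `ResultLog` class are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)   # frozen: a stored result cannot be edited in place
class BenchmarkResult:
    benchmark: str
    model_version: str
    measured_on: date
    score: float


class ResultLog:
    """Append-only history: entries are added, never replaced or removed."""

    def __init__(self):
        self._entries = []

    def add(self, result: BenchmarkResult) -> None:
        self._entries.append(result)

    def history(self, benchmark: str) -> list:
        """All results for a benchmark, oldest measurement first."""
        return sorted((r for r in self._entries if r.benchmark == benchmark),
                      key=lambda r: r.measured_on)

    def latest(self, benchmark: str) -> BenchmarkResult:
        """Most recent result; raises IndexError if none was recorded."""
        return self.history(benchmark)[-1]
```

Readers then see the newest value by default while older measurements remain available for comparison.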

We do this

Methodology publicly linked from every benchmark.
Confidence note and limitations mandatory.
Reproduction guide with script path and logs.

We don't do this

Cherry-picking — showing the best run.
Adopting vendors' own benchmarks 1:1.
Benchmarks without a limitations block.
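
Avoiding cherry-picking means reporting across all repetitions rather than the best run. The sketch below shows one way to do that with Python's standard `statistics` module; `summarize_runs` and the shape of its result are illustrative assumptions, not the published scripts.

```python
from statistics import mean, stdev


def summarize_runs(scores):
    """Summarize every repetition of a benchmark, not just the best one."""
    if len(scores) < 2:
        raise ValueError("need at least two repetitions to report variance")
    return {
        "repetitions": len(scores),
        "mean": mean(scores),
        "stdev": stdev(scores),   # sample standard deviation across runs
        "min": min(scores),
        "max": max(scores),
    }
```

Publishing `min` and `max` next to the mean lets readers see the spread that a single best run would hide.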

Mandatory fields per benchmark

Task · dataset (version, license) · models (version, mode) · hardware or API-only · metrics · repetitions · variance · cost in € · date · reproduction guide · limitations · confidence.
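
The mandatory-field list lends itself to an automated completeness check. The following is a sketch only: the key names (for example `cost_eur` and `reproduction_guide`) are hypothetical spellings of the fields listed above, and a real record would likely be nested.

```python
MANDATORY_FIELDS = (
    "task", "dataset", "models", "hardware", "metrics", "repetitions",
    "variance", "cost_eur", "date", "reproduction_guide", "limitations", "confidence",
)


def missing_benchmark_fields(record: dict) -> list:
    """Names of mandatory fields that are absent or empty in a benchmark record."""
    # A numeric 0 (e.g. zero variance) counts as present; None and empty values do not.
    return [f for f in MANDATORY_FIELDS if record.get(f) in (None, "", [], {})]
```

A record that returns a non-empty list would, by the rule above, not count as a benchmark claim.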

Corrections

How we correct mistakes.

We make mistakes. We correct them visibly, dated and with a reason — not quietly.

When we find a factual error or one is reported to us, we correct it. The correction appears as a visible entry with date and reason in the corrections log at `/legal/korrekturen`. The original entry remains traceable.

For substantial corrections (a false statement of fact, a wrong source, a misleading recommendation) we mark the affected post with a clear notice at the top. For cosmetic corrections (typos, grammar) an entry in the log is enough.
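
The split between substantial and cosmetic corrections can be expressed as a small rule. This is a hedged sketch: the category labels mirror the examples in the text, while `Correction`, `kind` and `needs_top_notice` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date

SUBSTANTIAL = {"false_statement", "wrong_source", "misleading_recommendation"}
COSMETIC = {"typo", "grammar"}


@dataclass(frozen=True)
class Correction:
    post_id: str
    corrected_on: date
    kind: str
    reason: str


def needs_top_notice(c: Correction) -> bool:
    """True if the post needs a notice at the top; every correction is logged either way."""
    if c.kind not in SUBSTANTIAL | COSMETIC:
        raise ValueError(f"unknown correction kind: {c.kind!r}")
    if not c.reason.strip():
        raise ValueError("every correction needs a reason")
    return c.kind in SUBSTANTIAL
```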

Vendors who dispute a claim get a statement block. We check, correct if necessary and document the outcome publicly.

We do this

Corrections with date and reason.
Corrections log publicly visible.
Vendor statements included.

We don't do this

Silent changes to the original post.
Hiding corrections or publishing them late.
Ignoring vendor complaints.

Disclosure

How we disclose our own interests.

Anyone promising methodology must disclose their own relationships. We do that proactively — even where we wouldn't have to.

Affiliate links do not appear. If we ever moved away from that, it would have to be visibly marked on every affected page and on the privacy/transparency page. As long as our methodology holds, we have no interest in affiliate revenue.

Our own skills, workflows or tools that we recommend or sell on the platform are clearly labeled as in-house editorial. They can appear in recommendation lists, but only with visible disclosure and the same methodology as third parties.

Sponsored content is clearly separated from the editorial area, always marked as such, and is never allowed to influence editorial decisions. Advisory board or consulting relationships are made public at `/legal/transparenz`.

We do this

Marking our own skills/workflows as “in-house editorial”.
Sponsored content visibly set apart.
Advisory/consulting relationships public.

We don't do this

Native advertorials that look like editorial.
Our own skills in recommendation slots without disclosure.
Concealed relationships with vendors.
