# Filters & domain lists

> Reusable query fragments and domain/URL lists you attach to taps to shape what they match — plus AI query generation.

Filters and domain lists are reusable building blocks that keep your rules clean. Instead of pasting
the same long `NOT domain:…` clause into every rule, define it once and apply it. Manage them all
from the [Filters](https://firehose.com/filters) page in the dashboard.

An organization can hold up to **50 filters**, and each list-type filter (a domain list or URL list)
can hold up to **5,000 entries**.

## Query filters

A **query filter** is a named, reusable [query](/stream/rules) fragment. Define common logic once — a
junk-URL exclusion, a language qualifier, a category set — and reference it from rules instead of
repeating it.

Define a filter named `no-junk`:

```text
NOT url:/.*\/page\/[0-9]+.*/ AND NOT url:*\/category\/* AND NOT url:*\/tag\/*
```

Then reference it from a rule by name with `$`:

```text
title:tesla AND language:"en" AND $no-junk
```

Firehose expands the reference inline before the query is matched. A filter's own body must be raw
query syntax: filters can't reference other filters.
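
The inline expansion can be pictured with a short Python sketch. This is an illustration, not Firehose's implementation: the `FILTERS` store, the `expand` function, and the exact characters allowed in a filter name are all hypothetical, and wrapping the expanded body in parentheses is an assumption made here to keep operator precedence safe.

```python
import re

# Hypothetical in-memory filter store; the body mirrors the `no-junk` example.
FILTERS = {
    "no-junk": r"NOT url:/.*\/page\/[0-9]+.*/ AND NOT url:*\/category\/* AND NOT url:*\/tag\/*",
}

# Assumed reference syntax: `$` followed by letters, digits, hyphens, or underscores.
REFERENCE = re.compile(r"\$([A-Za-z0-9_-]+)")


def expand(rule: str, filters: dict[str, str]) -> str:
    """Replace each $name in `rule` with the parenthesized filter body."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in filters:
            raise KeyError(f"unknown filter: {name}")
        body = filters[name]
        # Filters can't reference other filters, so a reference inside a body is rejected.
        if REFERENCE.search(body):
            raise ValueError(f"filter {name!r} references another filter")
        return f"({body})"

    return REFERENCE.sub(substitute, rule)


if __name__ == "__main__":
    print(expand('title:tesla AND language:"en" AND $no-junk', FILTERS))
```

Because a filter body is never expanded a second time, one pass over the rule is enough and the expansion can't loop.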

## Domain and URL lists

A **domain list** or **URL list** is a named set of values that Firehose expands into a query clause
and applies to a tap. A domain list `[techcrunch.com, theverge.com]` becomes:

```text
(domain:techcrunch.com OR domain:theverge.com)
```

A URL list expands the same way against the `url` field. Attach a list to a tap in one of two modes:

- **Include**: the tap matches only pages on those domains or URLs (the clause is joined with `AND`).
- **Exclude**: the tap never matches pages on those domains or URLs (the clause is joined with `AND NOT`).

This lets you keep a curated source list (include) or a blocklist (exclude) and apply it across all of
a tap's rules without editing each rule. Add and remove entries individually, or **import** a list in
bulk.
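
A minimal Python sketch of the list expansion and the two attach modes described above. The function names `list_clause` and `apply_list` are hypothetical, values are inserted verbatim with no escaping, and wrapping the rule in parentheses is an assumption. Firehose performs the real expansion internally.

```python
def list_clause(field: str, values: list[str]) -> str:
    """Expand a list into an OR clause over one field."""
    if not values:
        raise ValueError("list is empty")
    return "(" + " OR ".join(f"{field}:{value}" for value in values) + ")"


def apply_list(rule: str, field: str, values: list[str], mode: str) -> str:
    """Combine a rule with a list clause in include or exclude mode."""
    clause = list_clause(field, values)
    if mode == "include":
        return f"({rule}) AND {clause}"
    if mode == "exclude":
        return f"({rule}) AND NOT {clause}"
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    sources = ["techcrunch.com", "theverge.com"]
    print(list_clause("domain", sources))
    # (domain:techcrunch.com OR domain:theverge.com)
    print(apply_list("title:tesla", "domain", sources, "exclude"))
    # (title:tesla) AND NOT (domain:techcrunch.com OR domain:theverge.com)
```

The sketch takes one rule at a time. Attaching a list to a tap has the same effect on every rule in that tap, which is why no rule needs editing.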

<Callout type="warning">
  A **URL list** narrows which crawled pages a tap matches; it doesn't schedule crawls of those URLs.
  A tap can match a listed URL only when the crawler next crawls it, which happens on the crawler's
  own schedule. To watch specific pages for changes on a cadence you set, use
  [URL Watch](/url-watch/overview).
</Callout>

## Excluded domains

The **excluded domains** list is an organization-level blocklist. Domains on it are suppressed across
all of your rules, so you don't have to add the same `NOT domain:…` clause to each one. Use it for
sources you never want to hear from again — content farms, aggregators, or known-noisy sites.

Manage it from the [Excluded domains](https://firehose.com/excluded-domains) page. You can **import**
and **export** in bulk, including the Google Disavow file format, so an existing blocklist can be
reused.
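
If you want to inspect a disavow file before importing it, the format is simple enough to read by hand: one entry per line, `#` for comments, `domain:` prefixes for whole domains, and full URLs for single pages. The Python sketch below collects the domains from such a file. The function name `domains_from_disavow` is hypothetical, and reducing a full-URL entry to its hostname is an assumption made here, not necessarily how the Firehose importer treats those lines.

```python
from urllib.parse import urlsplit


def domains_from_disavow(text: str) -> list[str]:
    """Collect unique domains from disavow-format text, in first-seen order."""
    seen: dict[str, None] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.lower().startswith("domain:"):
            domain = line[len("domain:"):].strip().lower()
        else:
            # Assumption: a full-URL entry is reduced to its hostname.
            domain = (urlsplit(line).hostname or "").lower()
        if domain:
            seen.setdefault(domain, None)
    return list(seen)


if __name__ == "__main__":
    sample = """# noisy sources
domain:contentfarm.example
http://spam.example/page.html
"""
    print(domains_from_disavow(sample))
    # ['contentfarm.example', 'spam.example']
```

Entries that are neither a `domain:` line nor a URL with a scheme are skipped without an error in this sketch, so count the result against the source file if every line matters.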

<Callout type="info">
  Excluded domains apply at the organization level — to every tap. For a list that should only affect
  one tap, attach an **exclude**-mode domain list to that tap instead.
</Callout>

## AI query generation

Writing queries by hand has a learning curve, so Firehose can generate one from a plain-language
description. In the rule or filter editor, choose **Generate query** and describe your intent — for
example, *"news articles about electric vehicles in English from the last day, excluding pagination
pages."* Firehose produces a validated query plus a short explanation:

```text
"EV news in English, last 24h, no pagination"
        │  generate
        ▼
title:"electric vehicle" AND language:"en" AND recent:24h
  AND NOT url:/.*\/page\/[0-9]+.*/
```

<Callout type="tip">
  Treat the output as a strong first draft. The generator knows the field vocabulary, but you know
  your use case — adjust fields, add a query filter, or tighten the recency window before saving.
</Callout>

## Next steps

<CardGrid>
  <Card title="Rules & query syntax" href="/stream/rules">
    The grammar your filters and rules are written in.
  </Card>
  <Card title="The live feed" href="/dashboard/feed">
    Watch what your taps match in the dashboard.
  </Card>
</CardGrid>
