API reference

Dead Simple Search provides a REST API — a way for programs to communicate over HTTP (the same protocol your browser uses). All requests and responses use JSON, a lightweight data format that's easy to read and write.

The API listens on port 5555 by default. It has no authentication: anyone who can reach the server can register, crawl, search, and delete sites. Keep this in mind when deciding where to deploy.

Sites

Sites represent the websites you want to crawl and search. You register a site by providing its domain name and a starting URL.

Register a new site

POST /api/sites

Request body:

{
  "domain": "example.com",
  "start_url": "https://example.com/"
}

Field	Type	Required	Description
`domain`	string	Yes	The domain name, e.g. `example.com`
`start_url`	string	Yes	The URL where the crawler should begin

Response (201 Created):

{
  "id": 1,
  "domain": "example.com",
  "start_url": "https://example.com/"
}

If the domain is already registered, the API returns 409 Conflict.
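As an illustration, registering a site from Python might look like the sketch below. It uses only the standard library. The host `localhost` and the helper names `build_register_request` and `register_site` are assumptions made for this example; only the path, the request body, and the 201/409 status codes come from the reference above.

```python
import json
import urllib.error
import urllib.request

# Assumption: the server runs on this machine on the default port.
BASE_URL = "http://localhost:5555"


def build_register_request(domain, start_url, base_url=BASE_URL):
    """Build the POST /api/sites request without sending it."""
    body = json.dumps({"domain": domain, "start_url": start_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/sites",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def register_site(domain, start_url, base_url=BASE_URL):
    """Send the request; return the created site, or None on 409 Conflict."""
    request = build_register_request(domain, start_url, base_url)
    try:
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read())
    except urllib.error.HTTPError as err:
        if err.code == 409:
            return None  # the domain is already registered
        raise  # any other error status is passed on to the caller
```

Calling `register_site("example.com", "https://example.com/")` against a running server should return the created site as a dictionary, or `None` if the domain was registered earlier.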

List all sites

GET /api/sites

Returns an array of all registered sites.

Get site details

GET /api/sites/{id}

Returns the site details, including a `page_count` field showing how many pages have been indexed.

Delete a site

DELETE /api/sites/{id}

Removes the site and everything stored for it: the delete cascades to the site's indexed pages and crawl logs. Returns 204 No Content on success.
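A minimal sketch of the delete call in Python, assuming the server is reachable at `localhost` on the default port. The helper names `build_delete_request` and `delete_site` are hypothetical, not part of Dead Simple Search.

```python
import urllib.request

# Assumption: the server runs on this machine on the default port.
BASE_URL = "http://localhost:5555"


def build_delete_request(site_id, base_url=BASE_URL):
    """Build the DELETE /api/sites/{id} request without sending it."""
    return urllib.request.Request(f"{base_url}/api/sites/{site_id}", method="DELETE")


def delete_site(site_id, base_url=BASE_URL):
    """Send the request and report whether the server answered 204 No Content.

    urllib raises urllib.error.HTTPError for error statuses such as 404,
    so an unknown site id surfaces as an exception rather than False.
    """
    with urllib.request.urlopen(build_delete_request(site_id, base_url)) as response:
        return response.status == 204
```

Because the delete cascades and there is no authentication, a script like this is worth guarding with a confirmation step before it runs against real data.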

Crawling

Crawling is the process of visiting each page on a website, reading its content, and storing it for searching later.

Trigger a crawl

POST /api/sites/{id}/crawl

Starts a crawl in the background. The API responds immediately with 202 Accepted — it doesn't wait for the crawl to finish.

Response:

{
  "message": "Crawl started.",
  "site_id": 1
}

Check crawl status

GET /api/sites/{id}/crawl/status

Returns the most recent crawl log entries for the site.

Parameter	Type	Default	Description
`limit`	int	5	Number of log entries to return

Response:

[
  {
    "id": 1,
    "started_at": "2026-02-16T10:30:00",
    "finished_at": "2026-02-16T10:35:42",
    "pages_crawled": 127,
    "pages_failed": 3,
    "status": "completed"
  }
]

The `status` field is one of `running`, `completed`, or `failed`.
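Because the crawl runs in the background, a client that needs the outcome has to poll the status endpoint. The sketch below shows one way to do that in Python. The host `localhost`, the helper names, and the five-second poll interval are assumptions. The sketch also assumes the newest log entry comes first in the returned array, which this reference does not state.

```python
import json
import time
import urllib.request

# Assumption: the server runs on this machine on the default port.
BASE_URL = "http://localhost:5555"


def fetch_json(url, method="GET"):
    """Send a request with no body and decode the JSON response."""
    request = urllib.request.Request(url, method=method)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def latest_status(entries):
    """Return the status of the first log entry, or None if the list is empty.

    Assumes the newest entry is listed first.
    """
    return entries[0]["status"] if entries else None


def crawl_and_wait(site_id, poll_seconds=5.0, fetch=fetch_json, sleep=time.sleep):
    """Trigger a crawl, then poll until the latest entry stops running."""
    fetch(f"{BASE_URL}/api/sites/{site_id}/crawl", method="POST")
    status_url = f"{BASE_URL}/api/sites/{site_id}/crawl/status?limit=1"
    while True:
        status = latest_status(fetch(status_url))
        if status in ("completed", "failed"):
            return status
        sleep(poll_seconds)  # still running, or no log entry yet
```

If the log entry for the new crawl has not been written by the first poll, this loop could report the status of an earlier crawl. Comparing `started_at` against the time of the trigger would guard against that.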

Search

Search within a site

GET /api/sites/{id}/search?q=your+query

Parameter	Type	Default	Description
`q`	string	—	Required. The search query.
`lang`	string	—	Filter results by language code (e.g. `en`, `sv`, `de`).
`limit`	int	20	Results per page. Maximum 100.
`offset`	int	0	Number of results to skip (for pagination).

Response:

{
  "site_id": 1,
  "query": "python tutorial",
  "language_filter": null,
  "total": 42,
  "limit": 20,
  "offset": 0,
  "results": [
    {
      "page_id": 123,
      "url": "https://example.com/python-intro",
      "title": "Python Introduction",
      "meta_description": "Learn Python basics...",
      "h1": "Getting Started with Python",
      "language": "en",
      "relevance": 12.45,
      "snippet": "Python is a versatile programming language..."
    }
  ]
}

How relevance works: The relevance score is calculated by MySQL's full-text search engine. Higher numbers mean a better match. Results are always sorted by relevance, best matches first.
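The `limit`, `offset`, and `total` fields are enough to page through a full result set. The following Python sketch shows one possible approach. The host `localhost` and the helper names `search_url`, `fetch_page`, and `iter_results` are assumptions made for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the server runs on this machine on the default port.
BASE_URL = "http://localhost:5555"
MAX_LIMIT = 100  # documented maximum for the limit parameter


def search_url(site_id, query, lang=None, limit=20, offset=0, base_url=BASE_URL):
    """Build the search URL with a properly encoded query string."""
    params = {"q": query, "limit": min(limit, MAX_LIMIT), "offset": offset}
    if lang is not None:
        params["lang"] = lang
    return f"{base_url}/api/sites/{site_id}/search?{urllib.parse.urlencode(params)}"


def fetch_page(url):
    """Fetch one page of search results and decode the JSON response."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read())


def iter_results(site_id, query, lang=None, limit=20, fetch=fetch_page):
    """Yield every result, advancing offset until total is reached."""
    limit = min(limit, MAX_LIMIT)
    offset = 0
    while True:
        page = fetch(search_url(site_id, query, lang, limit, offset))
        yield from page["results"]
        offset += limit
        # Stop at the end of the result set, or on an unexpectedly empty page.
        if offset >= page["total"] or not page["results"]:
            break
```

`urllib.parse.urlencode` takes care of escaping, so a query such as `"getting started"` or `+python +tutorial` can be passed to `search_url` as written.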

Search modes: Dead Simple Search automatically detects whether to use "natural language" mode or "boolean" mode. If your query contains special characters like +, -, *, or " (quotation marks), it switches to boolean mode, which gives you more control:

Operator	Meaning	Example
`+`	Word must be present	`+python +tutorial`
`-`	Word must not be present	`python -java`
`*`	Wildcard (matches word beginnings)	`progr*`
`"..."`	Exact phrase match	`"getting started"`
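To make the detection rule concrete, here is a small Python sketch that mirrors it on the client side. It is a literal reading of the description above, not the server's actual implementation, which this reference does not show. The function names `looks_boolean` and `require_all` are hypothetical.

```python
# The four characters listed in the operator table above.
BOOLEAN_OPERATORS = ("+", "-", "*", '"')


def looks_boolean(query):
    """Return True if the query contains any of the listed operator characters."""
    return any(ch in query for ch in BOOLEAN_OPERATORS)


def require_all(words):
    """Build a boolean query in which every word must be present."""
    return " ".join(f"+{word}" for word in words)
```

Under this literal reading, a hyphenated word such as `e-mail` would also count as a boolean query. Whether the server treats it that way is not stated here, so it is worth testing such queries before relying on them.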

Error responses

When something goes wrong, the API returns a JSON error:

{
  "error": "Not Found",
  "message": "Site not found."
}

Status code	Meaning
`400`	Bad request — missing or invalid parameters
`404`	Not found — the site or resource doesn't exist
`409`	Conflict — e.g. trying to register a duplicate site
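A client can turn these JSON errors into exceptions. The Python sketch below is one way to do so with the standard library. The `ApiError` class and the helpers `parse_error` and `get_json` are assumptions made for this example, as is the fallback used when a response body is not JSON in the documented shape.

```python
import json
import urllib.error
import urllib.request


class ApiError(Exception):
    """An error response from the API, with the documented fields attached."""

    def __init__(self, status, error, message):
        super().__init__(f"{status} {error}: {message}")
        self.status = status
        self.error = error
        self.message = message


def parse_error(status, body):
    """Turn a status code and raw response body into an ApiError."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}  # the body was not valid JSON
    if not isinstance(payload, dict):
        payload = {}  # valid JSON, but not the documented object shape
    return ApiError(status, payload.get("error", "Unknown"), payload.get("message", ""))


def get_json(url):
    """GET a URL and decode the JSON body, raising ApiError on error statuses."""
    try:
        with urllib.request.urlopen(url) as response:
            return json.loads(response.read())
    except urllib.error.HTTPError as err:
        raise parse_error(err.code, err.read()) from err
```

A caller can then branch on `err.status`, for example treating 404 as "site not registered yet" and letting other statuses propagate.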