Tutorial

Deep Data Collection

Crawl websites, extract structured records into reusable datasets, share public data with the community, and earn token credits.

What is Deep Collection?

Deep collection is an automation mode that crawls specified URLs, extracts structured records from each page (e.g. property listings, product details, job postings), and stores them incrementally in your organization’s vector database.

Unlike a normal widget run that returns a single report, deep collection produces a growing dataset that you can query, chart, and feed into downstream widgets.

Tip: Deep collection requires a Standard plan or above. Free and Starter plans do not have access to this feature.

Creating a Collection Widget

Open the Automations page and click New Widget. Configure the widget as usual (name, questions, schedule), then expand the Deep Data Collection section and enable the toggle.

Crawl URLs: Enter one URL per line. These are the starting points from which the crawler follows links.
Pages to view: How many pages to visit per crawl URL. Your plan sets the upper limit.
Items to collect: The maximum number of records to extract. Your plan may cap this value.
Time limit: How long the crawler is allowed to run (1–30 minutes).
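
The four settings above can be pictured as a small configuration object with a few sanity checks. The following Python sketch is illustrative only: the class names, field names, and the PlanCaps values are hypothetical and are not part of the product's API. The only rules taken from this tutorial are "one URL per line", the plan-dependent caps, and the 1–30 minute time limit.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PlanCaps:
    """Hypothetical per-plan upper limits (real values depend on your plan)."""
    max_pages: int
    max_items: int


@dataclass
class CollectionSettings:
    """Hypothetical mirror of the Deep Data Collection form fields."""
    crawl_urls: list[str]
    pages_to_view: int
    items_to_collect: int
    time_limit_minutes: int


def parse_crawl_urls(text: str) -> list[str]:
    """Split the 'one URL per line' field, ignoring blank lines and padding."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def validate(settings: CollectionSettings, caps: PlanCaps) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if not settings.crawl_urls:
        errors.append("at least one crawl URL is required")
    if not 1 <= settings.pages_to_view <= caps.max_pages:
        errors.append(f"pages to view must be between 1 and {caps.max_pages}")
    if not 1 <= settings.items_to_collect <= caps.max_items:
        errors.append(f"items to collect must be between 1 and {caps.max_items}")
    # The 1-30 minute range is the one limit stated in the tutorial itself.
    if not 1 <= settings.time_limit_minutes <= 30:
        errors.append("time limit must be between 1 and 30 minutes")
    return errors
```

For example, with assumed caps of 10 pages and 100 items, a configuration asking for 50 pages and a 45-minute time limit would yield two validation messages.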

Viewing Collected Data

After the widget runs, open the widget detail panel. The report shows a markdown table of the extracted records. Each row is also stored as a vector-indexed chunk for future recall.
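
To give a feel for what such a report might look like, here is a minimal Python sketch that renders a list of extracted records as a markdown table. The function name and the record fields are assumptions made for illustration; the actual report layout is determined by the product.

```python
def to_markdown_table(records):
    """Render a list of dicts as a markdown table (hypothetical helper)."""
    if not records:
        return ""

    # Collect column names in first-seen order so sparse records still fit.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    def cell(value):
        # Escape pipes and flatten newlines so a value cannot break the row.
        return str(value).replace("|", "\\|").replace("\n", " ")

    lines = [
        "| " + " | ".join(columns) + " |",
        "| " + " | ".join("---" for _ in columns) + " |",
    ]
    for record in records:
        row = " | ".join(cell(record.get(col, "")) for col in columns)
        lines.append("| " + row + " |")
    return "\n".join(lines)
```

Records that lack a field simply produce an empty cell, which suits crawled data where not every page exposes every attribute.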

You can also ask questions about the collected data in a chat conversation — Elis will recall relevant chunks automatically.

Sharing Data

To make your dataset available to the community, set the widget’s Visibility to public (requires a Pro plan or above).

Only data classified as public_web (from web search and crawl tools) is shared. Data from SQL queries, knowledge bases, or document search is always classified as private and never leaves your organization.
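
The sharing rule above amounts to a filter on the origin of each record. The Python sketch below shows one way to express it. The tool names (web_search, web_crawl), the source_tool field, and both functions are hypothetical; only the public_web and private labels and the "public visibility required" rule come from this tutorial.

```python
PUBLIC_WEB = "public_web"
PRIVATE = "private"

# Assumed identifiers for the web search and crawl tools.
_PUBLIC_TOOLS = {"web_search", "web_crawl"}


def classify(source_tool):
    """Label a record by the tool that produced it; unknown tools stay private."""
    return PUBLIC_WEB if source_tool in _PUBLIC_TOOLS else PRIVATE


def shareable(records, widget_visibility):
    """Return only the records that may enter the community pool."""
    if widget_visibility != "public":
        return []
    return [r for r in records if classify(r["source_tool"]) == PUBLIC_WEB]
```

Defaulting anything unrecognized to private is a deliberately conservative choice in this sketch: a record has to be positively identified as web-sourced before it can be shared.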

Tip: When other organizations use your shared data, you earn token credits automatically. Check your token transaction history to see earned credits.

Using Shared Data

When you ask a question in chat, Elis searches both your organization’s data and the community shared pool. Shared results are tagged with [SHARED] and ranked slightly lower than your own data.
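
One way to picture "ranked slightly lower" is a small multiplicative penalty applied to the relevance score of shared results before the two result sets are merged. The Python sketch below assumes that approach. The penalty value of 0.9, the function name, and the (text, score) input shape are all hypothetical, and the actual ranking logic is not documented here.

```python
def merge_results(own, shared, penalty=0.9):
    """Merge own and shared hits, down-weighting and tagging the shared ones.

    `own` and `shared` are lists of (text, score) pairs, higher score = better.
    `penalty` is an assumed factor below 1.0 applied to shared scores.
    """
    scored = [(score, text) for text, score in own]
    scored += [(score * penalty, f"[SHARED] {text}") for text, score in shared]
    # Sort by adjusted score only, best first; ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]
```

With this scheme a shared result can still outrank your own data when it is clearly more relevant, but it loses close calls.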

Each plan has a daily limit on shared data rows. When the limit is reached, you’ll see aggregate summaries instead of full rows, along with a suggestion to collect the data yourself using web crawl.
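
The fallback described above can be sketched as a quota check that switches from full rows to an aggregate summary. In the Python example below, the function name, the response shape, the category grouping field, and the hint wording are illustrative assumptions. Only the behavior (full rows until the daily limit, then summaries plus a suggestion to crawl) comes from this tutorial.

```python
from collections import Counter


def serve_shared_rows(rows, used_today, daily_limit, group_key="category"):
    """Return full shared rows while quota remains, else an aggregate summary."""
    remaining = daily_limit - used_today
    if remaining <= 0:
        # Quota exhausted: expose only counts per group, not the rows.
        counts = Counter(row.get(group_key, "unknown") for row in rows)
        return {
            "mode": "summary",
            "total": len(rows),
            "by_group": dict(counts),
            "hint": "Daily shared-data limit reached; "
                    "consider collecting this data with a web crawl.",
        }
    # Quota available: return at most the remaining number of rows.
    return {"mode": "rows", "rows": rows[:remaining]}
```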