Coordimap Configuration Guide
Learn how to configure the Coordimap agent, choose the right scope_id, and keep cloud, Kubernetes, and database asset identities stable.
Coordimap Configuration Overview
The Coordimap agent reads a YAML configuration file that tells it which systems to crawl and how those systems should be identified.
At a minimum, each data source configuration should answer two questions:
- Which Coordimap connector record should this crawler write into?
- Which real upstream system owns the assets being discovered?
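As a rough illustration of those two questions, the sketch below checks a single data source entry, represented as a plain Python dict that mirrors the YAML shape used in this guide. The helper names `config_value` and `unanswered_questions` are hypothetical and are not part of the Coordimap agent.

```python
def config_value(source, name):
    """Look up an option in the entry's name/value `config` list."""
    for option in source.get("config", []):
        if option.get("name") == name:
            return option.get("value")
    return None


def unanswered_questions(source):
    """Return a list of problems for one data source entry (hypothetical check)."""
    problems = []
    # Question 1: which Coordimap connector record does this crawler write into?
    if not source.get("data_source_id"):
        problems.append("missing data_source_id")
    # Question 2: which real upstream system owns the discovered assets?
    if not config_value(source, "scope_id"):
        problems.append("missing scope_id")
    return problems


entry = {
    "type": "kubernetes",
    "data_source_id": "<YOUR_DATASOURCE_ID_FROM_UI>",
    "config": [{"name": "crawl_interval", "value": "5m"}],
}
problems = unanswered_questions(entry)
print(problems)
```

An entry that sets both fields yields an empty list. This is only a local sanity check; the agent's own validation may differ.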
Important Note About id vs data_source_id
If you look at some current agent repository examples, you may still see datasource examples written with id.
For Coordimap docs, prefer data_source_id when describing the agent configuration contract because that is the field used in the runtime payload model and in current validation and error paths.
If you are working from older examples, verify the expected field in your deployed agent version before copying them directly into production configuration.
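One way to spot older-style entries before they reach production is a small pre-deployment check. The following is a minimal sketch that assumes the data source entries have already been loaded into Python dicts; `find_legacy_id_fields` is a hypothetical helper, not an agent API, and it does not tell you which field your deployed agent version expects.

```python
def find_legacy_id_fields(data_sources):
    """Return indexes of entries that use `id` but lack `data_source_id`."""
    return [
        index
        for index, source in enumerate(data_sources)
        if "id" in source and "data_source_id" not in source
    ]


sources = [
    {"type": "kubernetes", "data_source_id": "ds-123"},  # current field name
    {"type": "kubernetes", "id": "ds-456"},  # older-style example
]
legacy = find_legacy_id_fields(sources)
print(legacy)
```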
How Identity Works In Coordimap
Coordimap uses two different identifiers for two different jobs:
- data_source_id identifies the data source record you created in the Coordimap UI.
- scope_id identifies the real upstream ownership boundary for the assets.
This is the most important configuration detail to get right.
If you recreate a data source in Coordimap but keep the same scope_id, Coordimap can continue to treat the discovered assets as the same infrastructure. If scope_id changes when the upstream system did not, you usually end up with duplicate assets, broken references, or flow mappings that no longer attach cleanly.
Read the full reference here: Shared Configuration Options.
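The effect of keeping or changing scope_id can be pictured with a toy model. The sketch below assumes, purely for illustration, that an asset's identity is derived from scope_id plus an identifier native to the upstream system; `AssetKey` and `native_id` are hypothetical names, and Coordimap's real identity scheme is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetKey:
    scope_id: str   # upstream ownership boundary
    native_id: str  # hypothetical: the asset's ID inside that upstream system


def asset_key(data_source, native_id):
    # In this toy model, data_source_id is deliberately not part of the key.
    return AssetKey(scope_id=data_source["scope_id"], native_id=native_id)


original = {"data_source_id": "ds-old", "scope_id": "cluster-uid-1"}
recreated = {"data_source_id": "ds-new", "scope_id": "cluster-uid-1"}
rescoped = {"data_source_id": "ds-new", "scope_id": "cluster-uid-2"}

# Recreating the data source with the same scope_id keeps the identity stable.
same = asset_key(original, "pod/web") == asset_key(recreated, "pod/web")
# Changing scope_id yields a different key, which would appear as a duplicate asset.
duplicate = asset_key(original, "pod/web") != asset_key(rescoped, "pod/web")
print(same, duplicate)
```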
Recommended scope_id Values
- Kubernetes: cluster UID
- GCP: project number
- AWS: account ID
- PostgreSQL: system identifier
- MySQL or MariaDB: server UUID
- MongoDB: replica set or cluster identity
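Because each recommended value has a recognizable shape, a quick format check can catch a scope_id pasted from the wrong place. The sketch below is a heuristic only: the type keys ("aws", "gcp", "postgresql", "mysql") are assumptions and may not match the agent's type names, and the Kubernetes rule assumes the cluster UID is taken from a Kubernetes object UID, which is UUID-formatted.

```python
import re
import uuid


def _is_uuid(value):
    """True if the string parses as a UUID."""
    try:
        uuid.UUID(value)
        return True
    except ValueError:
        return False


def _is_digits(value, length=None):
    """True if the string is ASCII digits, optionally of an exact length."""
    pattern = r"[0-9]+" if length is None else r"[0-9]{%d}" % length
    return re.fullmatch(pattern, value) is not None


# Hypothetical type keys mapped to a shape check for the recommended value.
CHECKS = {
    "kubernetes": _is_uuid,                   # cluster UID
    "aws": lambda v: _is_digits(v, 12),       # 12-digit account ID
    "gcp": _is_digits,                        # numeric project number
    "postgresql": _is_digits,                 # numeric system identifier
    "mysql": _is_uuid,                        # server UUID
}


def looks_plausible(source_type, scope_id):
    """Return False only when a known shape check fails."""
    check = CHECKS.get(source_type)
    return True if check is None else check(scope_id)


print(looks_plausible("aws", "123456789012"))
print(looks_plausible("kubernetes", "prod-cluster"))
```

Types without a rule, such as a MongoDB replica set identity, pass through unchecked. A passing check says nothing about whether the value is actually stable.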
A Minimal Agent Configuration Example
coordimap:
  api_key: ${COORDIMAP_API_KEY}
  data_sources:
    - type: kubernetes
      data_source_id: <YOUR_DATASOURCE_ID_FROM_UI>
      config:
        - name: scope_id
          value: "<YOUR_STABLE_UPSTREAM_ID>"
        - name: crawl_interval
          value: "5m"
Data Source Guides
- Shared Configuration Options
- Agent Runtime Options
- Metric Trigger Rules
- Kubernetes Configuration
- Google Cloud Platform Configuration
- AWS Configuration
- AWS Flow Logs Configuration
- PostgreSQL Configuration
- MySQL and MariaDB Configuration
- MongoDB Configuration
Flow And Metrics Guides
- GCP VPC Flow Logs
- AWS Flow Logs Configuration
- AWS Flow Logs To S3
- eBPF Flows Configuration
- Metric Trigger Rules
Environment Variables
Use environment variables for secrets and credentials whenever possible.
Examples:
- COORDIMAP_API_KEY
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- POSTGRES_PASSWORD
That keeps your configuration portable and prevents secrets from being committed into source control.
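A preflight check can confirm that every variable referenced in the configuration is actually set before the agent starts. The sketch below scans the raw config text for `${NAME}` placeholders, the form used for the API key in this guide; the `password` key in the sample text and the helper name `missing_env_vars` are hypothetical.

```python
import re

# Matches ${NAME} placeholders such as ${COORDIMAP_API_KEY}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text, environ):
    """Return the sorted names of referenced variables that are unset or empty."""
    names = PLACEHOLDER.findall(config_text)
    return sorted({name for name in names if not environ.get(name)})


config_text = "api_key: ${COORDIMAP_API_KEY}\npassword: ${POSTGRES_PASSWORD}\n"
missing = missing_env_vars(config_text, {"COORDIMAP_API_KEY": "example"})
print(missing)
```

In real use, pass `os.environ` as the second argument and stop the rollout if the returned list is not empty.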
FAQ
What is the most important Coordimap agent configuration field?
The most important identity field is scope_id. It should come from the upstream system being crawled, such as a Kubernetes cluster UID, AWS account ID, GCP project number, database system identifier, or MongoDB replica set identity.
Is data_source_id the same as scope_id?
No. data_source_id identifies the connector record in Coordimap. scope_id identifies the real upstream system that owns the discovered assets.
Can one agent configuration contain multiple data sources?
Yes. The coordimap.data_sources list can contain multiple data source blocks for Kubernetes, cloud accounts, databases, flow logs, and eBPF flows.
Should credentials be stored directly in config.yaml?
No. Prefer environment variables, Kubernetes Secrets, IAM roles, service accounts, or a secret manager instead of hard-coding credentials in the YAML file.