This page summarises shepard’s architecture as it exists today. Sources:
aidocs/01-repo-overview.md, architecture/src/05_building_block_view/,
architecture/src/09_architecture_decisions/, infrastructure/docker-compose.yml.
High-level block diagram
- backend --> Neo4j (metadata graph)
- backend --> MongoDB 8.0 (files and structured data)
- backend --> Postgres + TimescaleDB (timeseries)
- backend -. optional .- Postgres + PostGIS (spatial)
- backend -. SPARQL .- semantic repositories (e.g. Ontobee)
- backend -. webhooks .- subscribers
- OIDC IdP (Keycloak typical) --> caddy
- Prometheus --> backend
Stack
- Backend: Quarkus 3.27.x, Java 21, Maven; JJWT 0.11.5; Hibernate Spatial 7.2.6. Verified in backend/pom.xml (<maven.compiler.release>21</maven.compiler.release>).
- Frontend: Nuxt 3, Vue 3, Vuetify 3, TipTap, Vite 6.
- Build tooling: OpenAPI Generator for the polyglot clients; Renovate for dependency updates; docToolchain for HTML architecture docs.
Polyglot persistence — and why each store
| Store | Role | Reason |
|---|---|---|
| Neo4j 5.24 | Metadata graph | The Collection / DataObject / Reference / Container relationships are inherently graph-shaped; parent/child + predecessor/successor traversals are first-class. |
| MongoDB 8.0 | Files, structured documents | Variable-shaped JSON payloads and binary files do not benefit from a relational schema; MongoDB’s document model is the natural fit. |
| Postgres + TimescaleDB | Timeseries | Hypertables, time-bucket aggregation, and SQL-compatible ingestion outperformed InfluxDB for the workload (ADR-010 / ADR-011). |
| Postgres + PostGIS (optional) | Spatial data | Bounding-box queries returned in 380 ms versus 59 s on alternative stacks (ADR-014 / ADR-017). Behind the shepard.spatial-data.enabled feature flag. |
ADR rationale: see architecture/src/09_architecture_decisions/008-...,
010-..., 011-..., 014-..., 017-....
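To make the time-bucket aggregation mentioned for the timeseries store more concrete, here is a minimal pure-Python sketch of what bucketing does conceptually. It does not touch TimescaleDB or shepard's schema; the function name `bucket_average`, the sample points, and the 60-second bucket width are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean


def bucket_average(points, width_s):
    """Average (timestamp_s, value) pairs per fixed-width time bucket.

    Hypothetical helper: it mimics the idea behind a time-bucket
    aggregate, not TimescaleDB's implementation.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Floor each timestamp to the start of its bucket.
        buckets[ts - ts % width_s].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Made-up sample data: (seconds since start, measured value).
samples = [(0, 1.0), (30, 3.0), (60, 10.0), (119, 20.0), (120, 5.0)]
print(bucket_average(samples, 60))  # {0: 2.0, 60: 15.0, 120: 5.0}
```

In the real store this grouping would be expressed in SQL against a hypertable; the sketch only shows the grouping rule.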
Entity model
The four entity kinds, per architecture/src/05_building_block_view/ and the
data-model wiki page:
- Collection: top-level container; permissioned root for a workpackage, experiment, or campaign.
- DataObject: node within a Collection, related to other DataObjects as parent/child (composition) and predecessor/successor (derivation). Holds References.
- References: typed pointers from a DataObject to data: Structured-Data, File, Timeseries, Spatial-Data, URI, Lab-Journal, plus inter-entity References (Collection, DataObject).
- Containers: the actual payload store for Timeseries, Structured-Data, and File data; addressed by Reference.
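The relationships in the list above can be sketched as plain data classes. This is an illustrative model only, assuming simplified fields; the class and attribute names (`parent`, `predecessors`, `target`) and the helper `ancestors` are hypothetical and do not mirror shepard's Java entities or its Neo4j schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Reference:
    kind: str    # e.g. "File", "Timeseries", "URI"
    target: str  # hypothetical identifier of the container or payload


@dataclass
class DataObject:
    name: str
    parent: Optional[DataObject] = None                          # composition axis
    predecessors: list[DataObject] = field(default_factory=list)  # derivation axis
    references: list[Reference] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    data_objects: list[DataObject] = field(default_factory=list)


def ancestors(obj):
    """Walk the parent chain upward and return the ancestor names."""
    chain = []
    while obj.parent is not None:
        obj = obj.parent
        chain.append(obj.name)
    return chain


# Made-up example: a run with a raw measurement and a derived, filtered one.
campaign = Collection("campaign")
run = DataObject("run-1")
raw = DataObject("raw", parent=run,
                 references=[Reference("Timeseries", "container-1")])
filtered = DataObject("filtered", parent=run, predecessors=[raw])
campaign.data_objects += [run, raw, filtered]

print(ancestors(filtered))                    # ['run-1']
print([p.name for p in filtered.predecessors])  # ['raw']
```

Keeping composition and derivation as separate attributes reflects the two independent traversal axes named in the entity list.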
Concrete *Rest.java endpoints live under
backend/src/main/java/de/dlr/shepard/{context,data,auth,common}/.../endpoints/,
for example CollectionRest, DataObjectRest, FileReferenceRest, TimeseriesRest,
StructuredDataRest, SpatialDataPointRest, LabJournalEntryRest, URIReferenceRest,
SemanticRepositoryRest, SubscriptionRest, UserRest, UserGroupRest, ApiKeyRest,
SearchRest.
Auth model
- Inbound: JWT bearer tokens from an external OIDC provider (Keycloak in the typical deployment), validated against a pinned static OIDC public key.
- Long-lived access: X-API-KEY header (verified in the CORS allowlist quarkus.http.cors.headers in application.properties).
- Authorization: per-entity Permissions graph (Neo4j) with Owner / Manager / Writer / Reader roles plus group-level Reader/Writer; the entity visibility flag is one of Public, PublicReadable, Private (backend/src/main/java/de/dlr/shepard/common/util/PermissionType.java).
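From a client's point of view, the two credential types above translate into request headers. The sketch below assumes the JWT is sent in the standard `Authorization: Bearer` form and the API key in the `X-API-KEY` header named above; the helper `auth_headers` and the placeholder credentials are hypothetical, and no shepard endpoint is called.

```python
def auth_headers(*, bearer_token=None, api_key=None):
    """Build HTTP headers for exactly one credential type.

    Hypothetical client-side helper; the header names follow the auth
    list above, the rest is an assumption.
    """
    if (bearer_token is None) == (api_key is None):
        raise ValueError("pass exactly one of bearer_token or api_key")
    if bearer_token is not None:
        # Short-lived JWT issued by the OIDC provider.
        return {"Authorization": f"Bearer {bearer_token}"}
    # Long-lived API key.
    return {"X-API-KEY": api_key}


print(auth_headers(api_key="example-key"))
print(auth_headers(bearer_token="example.jwt.token"))
```

The resulting dictionary can be passed to whichever HTTP client or generated API client is in use.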
For deeper material, see the upstream architecture chapters under
architecture/src/. Cross-references to the in-repo design notes
aidocs/12 §11, aidocs/13, and aidocs/14 are marked planned / proposal;
some forward-referenced aidocs (e.g. aidocs/19, aidocs/24) are not yet
checked in at this snapshot date.
Export model
RO-Crate ZIP export with ro-crate-metadata.json, per aidocs/01. This is the
canonical machine-readable handover format; consumers can re-import the export
into another shepard instance or process it with the wider RO-Crate toolchain.
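As an illustration of processing such an export outside shepard, the following sketch reads ro-crate-metadata.json from a ZIP and indexes its entities by `@id`. It assumes the metadata file sits at the archive root and follows the general RO-Crate layout (a JSON-LD document with an `@graph` array); the function name `read_crate_entities` and the tiny in-memory crate are made up for the example and are not a real shepard export.

```python
import io
import json
import zipfile


def read_crate_entities(zip_bytes):
    """Return the @graph entities of an RO-Crate ZIP, keyed by @id."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Assumption: the metadata file is stored at the archive root.
        metadata = json.loads(archive.read("ro-crate-metadata.json"))
    return {entity["@id"]: entity for entity in metadata.get("@graph", [])}


# Build a stand-in crate in memory (hypothetical content).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("ro-crate-metadata.json", json.dumps({
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {"@id": "ro-crate-metadata.json", "@type": "CreativeWork",
             "about": {"@id": "./"}},
            {"@id": "./", "@type": "Dataset", "name": "example collection"},
        ],
    }))

entities = read_crate_entities(buffer.getvalue())
print(entities["./"]["name"])  # example collection
```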
Observability
- Prometheus metrics at /shepard/doc/metrics/prometheus (quarkus.micrometer.export.prometheus.enabled=true).
- Health endpoints under /shepard/api/healthz (quarkus.smallrye-health.root-path).
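A monitoring script could interpret the health response roughly as follows. The sketch assumes the endpoint returns the usual MicroProfile Health JSON shape (an overall `status` plus a `checks` array); the helper `failing_checks`, the sample body, and the check names in it are hypothetical, and no request is made.

```python
import json


def failing_checks(body):
    """Return the names of checks whose status is not "UP"."""
    payload = json.loads(body)
    return [
        check.get("name", "<unnamed>")
        for check in payload.get("checks", [])
        if check.get("status") != "UP"
    ]


# Made-up response body; real check names depend on the deployment.
sample_body = json.dumps({
    "status": "DOWN",
    "checks": [
        {"name": "database-a", "status": "UP"},
        {"name": "database-b", "status": "DOWN"},
    ],
})

print(failing_checks(sample_body))  # ['database-b']
```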
Where this site sits
This site does not duplicate the canonical Arc42 docs in architecture/src/.
For decision rationale and module decomposition, read those AsciiDoc sources
(or the rendered docToolchain output) directly.