Policy Engine
Summary
This ADR defines the Management and Execution API and Workflow of the DCM Policy Engine
Motivation
The Policy Engine operates as a specialized microservice within the Data Center Management (DCM) application responsible for governing service creation and modification (e.g., VirtualMachines, Containers). It enables Admins, Tenant-Admins, and Users to inject logic that validates (Approve/Reject), mutates (Defaulting/Altering) and assigns Service Providers to request payloads using Open Policy Agent (OPA) and Rego.
Goal
Define the flow of how Policies are managed and used by the Policy Engine
- Define Policy types - Global, Tenant, User
- Define Policy management
- How policies should be added/updated
- How policies will be stored
- Define Policy execution
- Policy priority
- Value immutability and constraints
- Determine the enforcement engine and policy language
- Define the input format
- Define the output format
Non-Goals
- Policy implementation
- Actionable OpenAPI specification
- While this ADR references
TenantlevelPolicyandID,Tenantsare not supported in V1
Core Concepts & Definitions
Policy Responsibilities
Every policy may return one or more of the following outputs
- Reject: Requests are approved by default. Policies may decide whether the request should be Rejected.
- Mutation: Modifying the request payload (e.g., injecting default labels) by providing a patch map.
- Field Constraints: Defining the mutability of fields for subsequent policies in the chain.
- Service Provider Selection: Policies may set a value and/or constraints
Policy Scope & Hierarchy (Execution Order)
The execution order is strictly determined by Level first, then Priority.
- Global: (Super Admin) - Runs first.
- Tenant: (Tenant Admin) - Runs second.
- User: (End User) - Runs last.
Within each level, policies are sorted by priority: lower integers indicate higher priority.
The “Rego Contract”
Input
The input payload includes:
- The current patched request payload
- Assumption - While policies do not have to be specific for Service Types they will need to know the expected content
- The current constraints
- User information
- User ID
- Tenant ID
- The service provider (empty at first and populated while evaluating policies)
- Value
- Constraints
Output
Following the policy responsibilities, the output should be comprised of the following elements
Reject - since requests are approved by default, policies may reject them.
Service Provider -
- Value - the name of the service provider chosen to fulfill the request
- Constraints - list of allowed SPs, can take a form of Allowlist of Regex
Patch - a dictionary of the corresponding service type for setting values. Each internal key is optional
Constraints - follows JSON Schema (draft 2020-12).
This standard supports:
- Immutable: const
- Numeric constraints: minimum, maximum, multipleOf
- String patterns: pattern, minLength, maxLength
- Enumerations: enum
- Array constraints: minItems, maxItems
- Conditional logic: if/then/else
For the complete validation vocabulary, see the JSON Schema Validation specification.
Policy Code Ownership and Responsibilities
- DCM Admins, Tenant-Admins and Users implement the policies’ REGO code
- DCM Admins, Tenant-Admins and Users are responsible for correct registration of the policies
- DCM Admins, Tenant-Admins and Users are responsible for the accuracy and performance of the policies
- Trying to register a REGO code snipet that fails compilation will fail
System Architecture
The Policy API serves two distinct functions:
- Management Plane: CRUD operations for Policy definitions and synchronization with the Policy Engine.
- Execution Plane: Service requests evaluation against active policies using a stored-policy model.
Policy Management
Policy Registration Flow
sequenceDiagram
participant User
participant PolicyEngine
participant Database
participant OPA
User->>PolicyEngine: POST /api/v1/policies
PolicyEngine->>Database: Check unique Name and Priority for policy type
alt Uniqueness check failed
PolicyEngine-->>User: Error response
else Uniqueness check passed
PolicyEngine->>PolicyEngine: Generate UUID
PolicyEngine->>Database: Store policy metadata
Note right of Database: UUID, Name, ServiceType,<br/>LabelSelector, Policy Type, Priority
PolicyEngine->>OPA: Push REGO code with UUID
alt REGO compilation failed
OPA-->>PolicyEngine: Compilation error
PolicyEngine->>Database: Rollback stored metadata
PolicyEngine-->>User: Error response
else REGO compilation succeeded
OPA-->>PolicyEngine: Success
PolicyEngine-->>User: Return UUID
end
end
Pseudo API
POST /api/v1/policies
Payload
- Name
- Must be unique at its level. That is:
- All global policies must have unique names
- All tenant policies must have unique names within their tenant
- All user policies must have unique names for their user
- Must be unique at its level. That is:
- Policy Matching Criteria. Treated with AND.
- ServiceType
- Label Selector
- Policy Type
- Global, Tenant, User
- Priority
- Must be unique at its level
- A lower number means a higher priority and therefore will be evaluated first
- REGO Code
- Enabled
- Optional. Default
true
- Optional. Default
Response Payload
- Generated UUID
Execution Logic & Flow
- Validate the Policy Name and Priority
- If not unique return an error
- Generate a UUID
- Store the following information in the DB
- UUID
- Name
- Service Type
- Policy Type
- Priority
- Push the REGO code to OPA
- Use the UUID for naming to avoid collisions
- If failed, rollback DB and return an error
- Return UUID to caller
GET /api/v1/policies
Return the list of policies. Allow for filtering
GET /api/v1/policies/{policyId}
Return the specific policy
DELETE /api/v1/policies/{policyId}
Delete the specific policy
PUT /api/v1/policies/{policyId}
Update the specific policy. Policy name and type are immutable
Payload
- Policy Matching Criteria
- Priority
- REGO Code
- Enabled
Execution Plane
Sequence
sequenceDiagram
participant User
participant PlacementManager
participant PolicyEngine
participant Database
participant OPA
User->>PlacementManager: Create Service request
PlacementManager->>PolicyEngine: Validate Payload
PolicyEngine->>Database: Get matching policies by serviceType and labelSelector
Database-->>PolicyEngine: List of policies
loop For each policy
PolicyEngine->>OPA: Evaluate policy
OPA-->>PolicyEngine: Policy result
PolicyEngine->>PolicyEngine: Enforce constraints
PolicyEngine->>PolicyEngine: Mutate payload
alt Policy rejected or constraint violation
PolicyEngine-->>PlacementManager: Request rejected
PlacementManager-->>User: Request rejected
end
end
PolicyEngine-->>PlacementManager: Success with updated payload
PlacementManager-->>User: Service created
Pseudo API
POST /api/v1/engine/evaluate
Payload
- Request Payload
- User ID
- Tenant ID
Execution Logic & Flow
The Engine acts as an orchestrator. It does not send Rego code during evaluation; it calls pre-loaded modules in OPA.
Pipeline Logic (The “Chain of Responsibility”)
The Policy API maintains a
ConstraintContextmap in memory for the duration of the request.Fetch & Sort:
- Query DB for enabled policies matching the request payload based on the policy’s matching criteria.
- Sort by Level (Global -> Tenant -> User) then Priority (Desc).
If no policies matching the request payload were found, the request will return successfully
Iterate for each policy P:
- Call
OPA:- Invoke data.dcm.policy.<P.id>.result
- Pass
CurrentRequestPayloadConstraintContextUserIDTenantIDServiceProvider
- Check
Reject- If
Rejectistrue, ABORT IMMEDIATELY (Fail Fast). Return 403.
- If
- Validate
Constraints:- A lower-level policy cannot “unlock” a field locked by a higher-level policy.
- If it does, ABORT with “Policy Conflict Error”
- Update
ConstraintContext:- Merge new
Constraintsfrom Policy P intoConstraintContext.
- Merge new
- Validate
Patch:- Validate
PatchagainstConstraintContext. - Example: If
ConstraintContext.regionis immutable and Policy P tries to patch theregion, ABORT with “Policy Conflict Error”
- Validate
- Apply
Patch- Update service_payload with valid patches.
- Validate
ServiceProvider- If Policy P returned a
ServiceProviderandServiceProviderConstraintsexists, validate it.
- If Policy P returned a
- Call
Finalize: Return the final
CurrentRequestPayloadandServiceProviderto Placement Manager.
Constraint Validation Example
- Step 1 (Global Policy):
- Patch: {“billing_tag”: “engineering”}
- Constraint: {“billing_tag”: {“mode”: “immutable”}}
- Result: Payload has billing_tag. Context has billing_tag=immutable.
- Step 2 (User Policy):
- Patch: {“billing_tag”: “marketing”}
- Action: Engine checks Context. billing_tag is immutable.
- Result: Error. The User policy violates the Global constraint.