Pipeline Architecture¶
What it is¶
The Autograder pipeline is the execution backbone that turns one submission into a complete grading result.
Every run follows a fixed order built by build_pipeline():
LOAD_TEMPLATE -> BUILD_TREE -> SANDBOX -> PRE_FLIGHT -> AI_BATCH -> GRADE -> FOCUS -> FEEDBACK? -> EXPORT?
FEEDBACK and EXPORT are optional, based on configuration.
Why it matters¶
- It keeps grading deterministic and debuggable.
- It separates concerns: each step has one responsibility.
- It makes extension safer: new behaviors can be added as steps, not scattered conditionals.
How it works¶
AutograderPipeline.run(submission)creates aPipelineExecution.- Each step receives the same
PipelineExecutionand appends oneStepResult. - If a step fails, execution stops early.
finish_execution()assemblesGradingResultfrom grade/focus/feedback artifacts.- Sandbox cleanup runs at the end and destroys any sandbox used by the submission.
Core data contract¶
PipelineExecution is the shared contract between steps:
- Submission data (
submission) - Ordered step outputs (
step_results) - Runtime status (
status) - Final result (
result)
Typed accessors such as get_loaded_template(), get_built_criteria_tree(), get_result_tree(), and get_focus() prevent ad-hoc data access in step implementations.
Example mental model¶
Think of the pipeline as a production line:
- Load Template chooses the toolset
- Build Tree builds the rubric structure
- Sandbox + Pre-Flight prepares a safe execution environment
- Grade produces raw scoring results
- Focus + Feedback convert scoring into learning guidance
Common mistakes¶
- Treating steps as independent services without respecting step order dependencies
- Generating feedback without focus data
- Documenting only successful flow and skipping fail-fast behavior