Data Structures

This document describes the core data structures used throughout the Autograder system.

Criteria Tree

The criteria tree represents the grading rubric. Its hierarchical structure mirrors how educators organize grading criteria.

Structure

CriteriaTree
├── Base (weight: 100)
│   ├── Subject: Functionality (weight: 60)
│   │   ├── Test: Correct Output (weight: 100)
│   │   └── Test: Edge Cases (weight: 100)
│   └── Subject: Code Quality (weight: 40)
│       ├── Test: Proper Syntax (weight: 50)
│       └── Test: Good Practices (weight: 50)
├── Bonus (weight: 10)
│   └── Test: Extra Features
└── Penalty (weight: -20)
    └── Test: Late Submission

Categories

Base Category

  • Purpose: Core grading criteria
  • Weight: Typically 100 (represents 100% of base grade)
  • Contains: Subjects and/or tests
  • Required: Yes

Bonus Category

  • Purpose: Extra credit opportunities
  • Weight: Additional percentage (e.g., 10 = up to 10% extra)
  • Contains: Tests only (no subject grouping)
  • Required: No

Penalty Category

  • Purpose: Deductions from total score
  • Weight: Negative percentage (e.g., -20 = up to 20% deduction)
  • Contains: Tests only (no subject grouping)
  • Required: No
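The category and node types described above can be modeled with a few small classes. The following is an illustrative sketch only; the system's actual class names and fields may differ:

```python
from dataclasses import dataclass, field

# Illustrative sketch of criteria tree nodes (hypothetical names,
# not the Autograder's actual classes).

@dataclass
class Test:
    name: str
    parameters: dict = field(default_factory=dict)
    weight: float = 100

@dataclass
class Subject:
    subject_name: str
    weight: float
    tests: list = field(default_factory=list)
    subjects: list = field(default_factory=list)   # nested subjects

@dataclass
class Category:
    name: str        # "base", "bonus", or "penalty"
    weight: float    # 100, a positive extra, or a negative deduction
    required: bool   # only "base" is required
    subjects: list = field(default_factory=list)   # empty for bonus/penalty
    tests: list = field(default_factory=list)      # direct tests

base = Category(
    name="base", weight=100, required=True,
    subjects=[Subject("Functionality", 60,
                      tests=[Test("expect_output", weight=100)])],
)
```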

Subjects

Subjects group related tests together with weighted importance.

Properties:

  • subject_name: Human-readable name
  • weight: Relative importance (0-100)
  • tests: List of test configurations
  • subjects: Nested subjects (for hierarchical organization)

Example:

{
  "subject_name": "HTML Structure",
  "weight": 40,
  "tests": [
    {
      "name": "has_tag",
      "parameters": {"tag": "header"},
      "weight": 50
    }
  ]
}

Tests

Individual grading checks within subjects or categories.

Properties:

  • name: Test function name from the template library
  • parameters: Test-specific configuration
  • weight: Relative importance within parent (0-100)

Example:

{
  "name": "expect_output",
  "parameters": {
    "inputs": ["Alice"],
    "expected_output": "Hello, Alice!",
    "program_command": "python3 greeting.py"
  },
  "weight": 100
}

Result Tree

The result tree is the output structure after grading. It mirrors the criteria tree but includes actual scores and reports.

Structure

{
  "final_score": 85.5,
  "root": {
    "base": {
      "name": "base",
      "score": 85.5,
      "weight": 100,
      "max_score": 100,
      "subjects": [
        {
          "name": "Functionality",
          "score": 90,
          "weight": 60,
          "max_score": 60,
          "tests": [
            {
              "name": "expect_output",
              "score": 100,
              "weight": 100,
              "max_score": 100,
              "report": "Test passed successfully!",
              "passed": true
            }
          ]
        },
        {
          "name": "Code Quality",
          "score": 80,
          "weight": 40,
          "max_score": 40,
          "tests": [
            {
              "name": "check_syntax",
              "score": 80,
              "weight": 100,
              "max_score": 100,
              "report": "Minor style issues detected",
              "passed": false
            }
          ]
        }
      ]
    },
    "bonus": {
      "name": "bonus",
      "score": 5,
      "weight": 10,
      "max_score": 10,
      "tests": [
        {
          "name": "extra_features",
          "score": 50,
          "weight": 100,
          "max_score": 100,
          "report": "Some extra features implemented",
          "passed": false
        }
      ]
    },
    "penalty": {
      "name": "penalty",
      "score": 0,
      "weight": -20,
      "max_score": 0,
      "tests": []
    }
  }
}
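Because the result tree mirrors the criteria tree, a recursive walk can collect all failing tests regardless of nesting depth. A sketch against the field names in the example above:

```python
def failed_tests(node, collected=None):
    """Recursively collect names of tests with passed == False."""
    if collected is None:
        collected = []
    for test in node.get("tests", []):
        if not test.get("passed", False):
            collected.append(test["name"])
    for subject in node.get("subjects", []):
        failed_tests(subject, collected)
    return collected

# Trimmed version of the result tree shown above
result_tree = {
    "final_score": 85.5,
    "root": {
        "base": {"subjects": [
            {"name": "Functionality",
             "tests": [{"name": "expect_output", "passed": True}]},
            {"name": "Code Quality",
             "tests": [{"name": "check_syntax", "passed": False}]},
        ]},
        "bonus": {"tests": [{"name": "extra_features", "passed": False}]},
    },
}

failing = []
for category in result_tree["root"].values():
    failed_tests(category, failing)
# failing == ["check_syntax", "extra_features"]
```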

Score Calculation

Scores are calculated hierarchically:

  1. Test Level: Raw score (0-100) based on test execution
  2. Subject Level: Weighted average of all tests within subject
  3. Category Level: Weighted average of all subjects (or direct tests)
  4. Final Score: base_score + bonus_score - penalty_score

Example Calculation:

Test 1: 100 (weight: 50) = 50 points
Test 2: 80 (weight: 50) = 40 points
Subject Score: 50 + 40 = 90

Subject 1: 90 (weight: 60) = 54 points
Subject 2: 80 (weight: 40) = 32 points
Base Score: 54 + 32 = 86

Bonus Score: 5 (from 10 max)
Penalty Score: 0

Final Score: 86 + 5 - 0 = 91
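The calculation above reduces to a weighted sum at each level. A minimal sketch, assuming sibling weights sum to 100 as in the example:

```python
def weighted_score(items):
    """Combine (score, weight) pairs; weights are assumed to sum to 100."""
    return sum(score * weight / 100 for score, weight in items)

subject = weighted_score([(100, 50), (80, 50)])   # 50 + 40 = 90
base = weighted_score([(subject, 60), (80, 40)])  # 54 + 32 = 86

bonus = 5      # test scored 50/100 in a weight-10 bonus category
penalty = 0

final = base + bonus - penalty
# final == 91.0
```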

Grading Result

The complete grading result object returned after pipeline execution.

Properties

class GradingResult:
    final_score: float                     # Overall grade (0-100+)
    feedback: Optional[str] = None         # Generated feedback (if enabled)
    result_tree: Optional[ResultTree] = None  # Detailed score breakdown
    focus: Optional[Focus] = None          # Focus object organizing tests by impact
    error: Optional[str] = None            # Error message (if grading failed)
    failed_at_step: Optional[str] = None   # Pipeline step where failure occurred

Example

{
  "final_score": 91.0,
  "feedback": "Great work! Your program meets most requirements...",
  "result_tree": {
    "final_score": 91.0,
    "root": { ... }
  },
  "focus": {
    "high_impact": [ ... ],
    "medium_impact": [ ... ],
    "low_impact": [ ... ]
  },
  "error": null,
  "failed_at_step": null
}
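A typical consumer should check the error field before trusting final_score. A sketch against the fields listed above (the summarize helper is illustrative, not part of the system):

```python
def summarize(result: dict) -> str:
    """Render a one-line summary of a grading result dict."""
    if result.get("error"):
        return f"Grading failed at {result['failed_at_step']}: {result['error']}"
    line = f"Score: {result['final_score']:.1f}"
    if result.get("feedback"):
        line += f" | {result['feedback']}"
    return line

ok = {"final_score": 91.0, "feedback": "Great work!",
      "error": None, "failed_at_step": None}
bad = {"final_score": 0.0, "error": "Sandbox timeout",
       "failed_at_step": "SANDBOX"}

print(summarize(ok))   # Score: 91.0 | Great work!
print(summarize(bad))  # Grading failed at SANDBOX: Sandbox timeout
```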

Pipeline Execution

Internal state object passed between pipeline steps and used to generate API responses.

Properties

class PipelineExecution:
    submission: Submission           # Student files and metadata
    step_results: list[StepResult]  # Results from each pipeline step
    assignment_id: int              # Assignment identifier
    status: PipelineStatus          # Overall execution status (SUCCESS, FAILED, RUNNING)
    result: GradingResult | None    # Final grading result (if grading completed)
    start_time: float               # Execution start timestamp

Pipeline Execution Summary

The PipelineExecutionSerializer (autograder/serializers/pipeline_execution_serializer.py) handles the conversion of an execution into a dictionary for API responses.

Step Status Semantics

During execution, each step returns a StepResult containing a status:

  • SUCCESS: The step executed correctly and produced the expected output data.
  • FAIL: The step executed correctly but encountered a logical domain failure (e.g. the pre-flight file check found required files missing, or code styling was poor).
  • INTERRUPTED: The step hit an unexpected system-level error or unhandled exception. The pipeline base class traps this to prevent a full collapse.
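The distinction between FAIL (a domain outcome) and INTERRUPTED (a system error) can be sketched with an enum and a guard in the step runner. This is illustrative, not the actual base-class code:

```python
from enum import Enum

class StepStatus(Enum):
    SUCCESS = "success"
    FAIL = "fail"                # domain failure: step ran, outcome negative
    INTERRUPTED = "interrupted"  # system failure: unhandled exception

def run_step(step_fn, *args):
    """Trap unexpected exceptions so one step cannot collapse the pipeline."""
    try:
        return step_fn(*args)
    except Exception as exc:
        return StepStatus.INTERRUPTED, str(exc)

def pre_flight(files):
    # A domain failure, not an exception: a required file is missing
    if "main.py" not in files:
        return StepStatus.FAIL, "missing required file: main.py"
    return StepStatus.SUCCESS, "required files found"

status, msg = run_step(pre_flight, {"main.py": "print('hi')"})
# status == StepStatus.SUCCESS
status2, msg2 = run_step(pre_flight, {})
# status2 == StepStatus.FAIL
```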

PipelineExecutionSerializer.serialize(execution) -> dict

Generates detailed summary for API responses.

Returns:

{
    "status": "success" | "failed",
    "failed_at_step": str | None,
    "total_steps_planned": int,
    "steps_completed": int,
    "execution_time_ms": int,
    "steps": [
        {
            "name": "PreFlightStep",
            "status": "success" | "fail",
            "message": str,
            "error_details": dict | None  # Present if step failed
        }
    ]
}

Example Summary (Success)

{
  "status": "success",
  "failed_at_step": null,
  "total_steps_planned": 8,
  "steps_completed": 8,
  "execution_time_ms": 4521,
  "steps": [
    {"name": "LOAD_TEMPLATE", "status": "success"},
    {"name": "PRE_FLIGHT", "status": "success", "message": "Required files found"},
    {"name": "SANDBOX", "status": "success", "message": "Sandbox environment ready"},
    {"name": "GRADE", "status": "success", "message": "Grading completed: 85.5/100"}
  ]
}

Example Summary (Sandbox Failure)

{
  "status": "failed",
  "failed_at_step": "SANDBOX",
  "total_steps_planned": 8,
  "steps_completed": 4,
  "execution_time_ms": 1523,
  "steps": [
    {"name": "LOAD_TEMPLATE", "status": "success"},
    {"name": "BUILD_TREE", "status": "success"},
    {"name": "PRE_FLIGHT", "status": "success"},
    {
      "name": "SANDBOX",
      "status": "fail",
      "message": "Setup command 'Compile Calculator.java' failed",
      "error_details": {
        "error_type": "setup_command_failed",
        "phase": "setup_commands",
        "command_name": "Compile Calculator.java",
        "failed_command": {
          "command": "javac Calculator.java",
          "exit_code": 1,
          "stderr": "Calculator.java:4: error: ';' expected..."
        }
      }
    }
  ]
}

Test Result

Individual test execution result.

Properties

class TestResult:
    test_name: str                            # Name of the test function
    score: float                              # Score achieved (0-100)
    report: str                               # Human-readable result message
    subject_name: str = ""                    # Parent subject name
    parameters: Optional[Dict[str, Any]]      # Test parameters used

Example

{
  "test_name": "expect_output",
  "score": 100,
  "report": "Output matched expected value 'Hello, Alice!'",
  "subject_name": "Functionality",
  "parameters": {
    "inputs": ["Alice"],
    "expected_output": "Hello, Alice!",
    "program_command": "python3 main.py"
  }
}

Submission

Student submission data.

Properties

class Submission:
    username: str                                   # Student identifier
    user_id: int                                    # Student ID
    assignment_id: int                              # Assignment identifier
    submission_files: Dict[str, SubmissionFile]      # Uploaded files keyed by filename
    language: Optional[Language] = None             # Programming language

Submission File

class SubmissionFile:
    filename: str      # Name of file
    content: str       # File contents (text)

Template

Template provides test functions for a specific assignment type.

Properties

class Template:
    name: str                           # Template identifier
    tests: dict[str, TestFunction]      # Available test functions
    requires_sandbox: bool              # Whether code execution needed
    supported_languages: list[str]      # Compatible languages

Test Function

class TestFunction:
    name: str                          # Function identifier

    def execute(
        self,
        files: dict[str, str],
        sandbox: SandboxContainer | None,
        **kwargs
    ) -> TestResult:
        """Execute the test and return result"""
        pass
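A concrete implementation of this interface might look like the following sketch: a hypothetical has_tag check for HTML submissions. The TestResult fields follow the definition above; the class itself is illustrative, not taken from the template library:

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

@dataclass
class TestResult:
    test_name: str
    score: float
    report: str
    subject_name: str = ""
    parameters: Optional[Dict[str, Any]] = None

class HasTag:
    """Hypothetical TestFunction: checks that any file contains <tag>."""
    name = "has_tag"

    def execute(self, files, sandbox=None, **kwargs):
        tag = kwargs["tag"]
        found = any(f"<{tag}" in content for content in files.values())
        return TestResult(
            test_name=self.name,
            score=100.0 if found else 0.0,
            report=f"<{tag}> {'found' if found else 'not found'}",
            parameters={"tag": tag},
        )

result = HasTag().execute({"index.html": "<header>Hi</header>"}, tag="header")
# result.score == 100.0
```

Tests that need code execution (e.g. expect_output) would additionally use the sandbox argument; static checks like this one can ignore it.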