Data Structures

This document describes the core data structures used throughout the Autograder system.

Criteria Tree

The criteria tree represents the grading rubric. Its hierarchical structure mirrors how educators organize grading criteria.

Structure

CriteriaTree
├── Base (weight: 100)
│   ├── Subject: Functionality (weight: 60)
│   │   ├── Test: Correct Output (weight: 100)
│   │   └── Test: Edge Cases (weight: 100)
│   └── Subject: Code Quality (weight: 40)
│       ├── Test: Proper Syntax (weight: 50)
│       └── Test: Good Practices (weight: 50)
├── Bonus (weight: 10)
│   └── Test: Extra Features
└── Penalty (weight: -20)
    └── Test: Late Submission

Categories

Base Category

  • Purpose: Core grading criteria
  • Weight: Typically 100 (represents 100% of base grade)
  • Contains: Subjects and/or tests
  • Required: Yes

Bonus Category

  • Purpose: Extra credit opportunities
  • Weight: Additional percentage (e.g., 10 = up to 10% extra)
  • Contains: Tests only (no subject grouping)
  • Required: No

Penalty Category

  • Purpose: Deductions from total score
  • Weight: Negative percentage (e.g., -20 = up to 20% deduction)
  • Contains: Tests only (no subject grouping)
  • Required: No
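The category and node types described above can be modeled with a few small classes. The following is an illustrative sketch only; the system's actual class names and fields may differ:

```python
from dataclasses import dataclass, field

# Illustrative sketch of criteria tree nodes (hypothetical names,
# not the Autograder's actual classes).

@dataclass
class Test:
    name: str
    parameters: dict = field(default_factory=dict)
    weight: float = 100

@dataclass
class Subject:
    subject_name: str
    weight: float
    tests: list = field(default_factory=list)
    subjects: list = field(default_factory=list)   # nested subjects

@dataclass
class Category:
    name: str        # "base", "bonus", or "penalty"
    weight: float    # 100, a positive extra, or a negative deduction
    required: bool   # only "base" is required
    subjects: list = field(default_factory=list)   # empty for bonus/penalty
    tests: list = field(default_factory=list)      # direct tests

base = Category(
    name="base", weight=100, required=True,
    subjects=[Subject("Functionality", 60,
                      tests=[Test("expect_output", weight=100)])],
)
```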

Subjects

Subjects group related tests together with weighted importance.

Properties:

  • subject_name: Human-readable name
  • weight: Relative importance (0-100)
  • tests: List of test configurations
  • subjects: Nested subjects (for hierarchical organization)

Example:

{
  "subject_name": "HTML Structure",
  "weight": 40,
  "tests": [
    {
      "name": "has_tag",
      "parameters": {"tag": "header"},
      "weight": 50
    }
  ]
}

Tests

Individual grading checks within subjects or categories.

Properties:

  • name: Test function name from the template library
  • parameters: Test-specific configuration
  • weight: Relative importance within parent (0-100)

Example:

{
  "name": "expect_output",
  "parameters": {
    "inputs": ["Alice"],
    "expected_output": "Hello, Alice!",
    "program_command": "python3 greeting.py"
  },
  "weight": 100
}

Result Tree

The result tree is the output structure after grading. It mirrors the criteria tree but includes actual scores and reports.

Structure

{
  "final_score": 85.5,
  "root": {
    "base": {
      "name": "base",
      "score": 85.5,
      "weight": 100,
      "max_score": 100,
      "subjects": [
        {
          "name": "Functionality",
          "score": 90,
          "weight": 60,
          "max_score": 60,
          "tests": [
            {
              "name": "expect_output",
              "score": 100,
              "weight": 100,
              "max_score": 100,
              "report": "Test passed successfully!",
              "passed": true
            }
          ]
        },
        {
          "name": "Code Quality",
          "score": 80,
          "weight": 40,
          "max_score": 40,
          "tests": [
            {
              "name": "check_syntax",
              "score": 80,
              "weight": 100,
              "max_score": 100,
              "report": "Minor style issues detected",
              "passed": false
            }
          ]
        }
      ]
    },
    "bonus": {
      "name": "bonus",
      "score": 5,
      "weight": 10,
      "max_score": 10,
      "tests": [
        {
          "name": "extra_features",
          "score": 50,
          "weight": 100,
          "max_score": 100,
          "report": "Some extra features implemented",
          "passed": false
        }
      ]
    },
    "penalty": {
      "name": "penalty",
      "score": 0,
      "weight": -20,
      "max_score": 0,
      "tests": []
    }
  }
}
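Because the result tree mirrors the criteria tree, a recursive walk can collect all failing tests regardless of nesting depth. A sketch against the field names in the example above:

```python
def failed_tests(node, collected=None):
    """Recursively collect names of tests with passed == False."""
    if collected is None:
        collected = []
    for test in node.get("tests", []):
        if not test.get("passed", False):
            collected.append(test["name"])
    for subject in node.get("subjects", []):
        failed_tests(subject, collected)
    return collected

# Trimmed version of the result tree shown above
result_tree = {
    "final_score": 85.5,
    "root": {
        "base": {"subjects": [
            {"name": "Functionality",
             "tests": [{"name": "expect_output", "passed": True}]},
            {"name": "Code Quality",
             "tests": [{"name": "check_syntax", "passed": False}]},
        ]},
        "bonus": {"tests": [{"name": "extra_features", "passed": False}]},
    },
}

failing = []
for category in result_tree["root"].values():
    failed_tests(category, failing)
# failing == ["check_syntax", "extra_features"]
```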

Score Calculation

Scores are calculated hierarchically:

  1. Test Level: Raw score (0-100) based on test execution
  2. Subject Level: Weighted average of all tests within subject
  3. Category Level: Weighted average of all subjects (or direct tests)
  4. Final Score: base_score + bonus_score - penalty_score

Example Calculation:

Test 1: 100 (weight: 50) = 50 points
Test 2: 80 (weight: 50) = 40 points
Subject Score: 50 + 40 = 90

Subject 1: 90 (weight: 60) = 54 points
Subject 2: 80 (weight: 40) = 32 points
Base Score: 54 + 32 = 86

Bonus Score: 5 (from 10 max)
Penalty Score: 0

Final Score: 86 + 5 - 0 = 91
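The calculation above reduces to a weighted sum at each level. A minimal sketch, assuming sibling weights sum to 100 as in the example:

```python
def weighted_score(items):
    """Combine (score, weight) pairs; weights are assumed to sum to 100."""
    return sum(score * weight / 100 for score, weight in items)

subject = weighted_score([(100, 50), (80, 50)])   # 50 + 40 = 90
base = weighted_score([(subject, 60), (80, 40)])  # 54 + 32 = 86

bonus = 5      # test scored 50/100 in a weight-10 bonus category
penalty = 0

final = base + bonus - penalty
# final == 91.0
```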

Grading Result

The complete grading result object returned after pipeline execution.

Properties

class GradingResult:
    final_score: float                     # Overall grade (0-100+)
    feedback: Optional[str] = None         # Generated feedback (if enabled)
    result_tree: Optional[ResultTree] = None  # Detailed score breakdown
    focus: Optional[Focus] = None          # Focus object organizing tests by impact
    error: Optional[str] = None            # Error message (if grading failed)
    failed_at_step: Optional[str] = None   # Pipeline step where failure occurred

Example

{
  "final_score": 91.0,
  "feedback": "Great work! Your program meets most requirements...",
  "result_tree": {
    "final_score": 91.0,
    "root": { ... }
  },
  "focus": {
    "high_impact": [ ... ],
    "medium_impact": [ ... ],
    "low_impact": [ ... ]
  },
  "error": null,
  "failed_at_step": null
}
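A typical consumer should check the error field before trusting final_score. A sketch against the fields listed above (the summarize helper is illustrative, not part of the system):

```python
def summarize(result: dict) -> str:
    """Render a one-line summary of a grading result dict."""
    if result.get("error"):
        return f"Grading failed at {result['failed_at_step']}: {result['error']}"
    line = f"Score: {result['final_score']:.1f}"
    if result.get("feedback"):
        line += f" | {result['feedback']}"
    return line

ok = {"final_score": 91.0, "feedback": "Great work!",
      "error": None, "failed_at_step": None}
bad = {"final_score": 0.0, "error": "Sandbox timeout",
       "failed_at_step": "SANDBOX"}

print(summarize(ok))   # Score: 91.0 | Great work!
print(summarize(bad))  # Grading failed at SANDBOX: Sandbox timeout
```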

Pipeline Execution

Internal state object passed between pipeline steps and used to generate API responses.

Properties

class PipelineExecution:
    submission: Submission           # Student files and metadata
    step_results: list[StepResult]  # Results from each pipeline step
    assignment_id: int              # Assignment identifier
    status: PipelineStatus          # Overall execution status (SUCCESS, FAILED, RUNNING)
    result: GradingResult | None    # Final grading result (if grading completed)
    start_time: float               # Execution start timestamp

Pipeline Execution Summary

The PipelineExecutionSerializer (autograder/serializers/pipeline_execution_serializer.py) handles the conversion of an execution into a dictionary for API responses.

Step Status Semantics

During execution, each step returns a StepResult containing a status:

  • SUCCESS: The step executed correctly and produced the expected output data.
  • FAIL: The step executed correctly but encountered a logical domain failure (e.g. the pre-flight file check found required files missing, or code styling was poor).
  • INTERRUPTED: The step hit an unexpected system-level error or unhandled exception. The pipeline base class traps this to prevent a full collapse.
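The distinction between FAIL (a domain outcome) and INTERRUPTED (a system error) can be sketched with an enum and a guard in the step runner. This is illustrative, not the actual base-class code:

```python
from enum import Enum

class StepStatus(Enum):
    SUCCESS = "success"
    FAIL = "fail"                # domain failure: step ran, outcome negative
    INTERRUPTED = "interrupted"  # system failure: unhandled exception

def run_step(step_fn, *args):
    """Trap unexpected exceptions so one step cannot collapse the pipeline."""
    try:
        return step_fn(*args)
    except Exception as exc:
        return StepStatus.INTERRUPTED, str(exc)

def pre_flight(files):
    # A domain failure, not an exception: a required file is missing
    if "main.py" not in files:
        return StepStatus.FAIL, "missing required file: main.py"
    return StepStatus.SUCCESS, "required files found"

status, msg = run_step(pre_flight, {"main.py": "print('hi')"})
# status == StepStatus.SUCCESS
status2, msg2 = run_step(pre_flight, {})
# status2 == StepStatus.FAIL
```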

PipelineExecutionSerializer.serialize(execution) -> dict

Generates detailed summary for API responses.

Returns:

{
    "status": "success" | "failed",
    "failed_at_step": str | None,
    "total_steps_planned": int,
    "steps_completed": int,
    "execution_time_ms": int,
    "steps": [
        {
            "name": "PreFlightStep",
            "status": "success" | "fail",
            "message": str,
            "error_details": dict | None  # Present if step failed
        }
    ]
}

Example Summary (Success)

{
  "status": "success",
  "failed_at_step": null,
  "total_steps_planned": 8,
  "steps_completed": 8,
  "execution_time_ms": 4521,
  "steps": [
    {"name": "LOAD_TEMPLATE", "status": "success"},
    {"name": "PRE_FLIGHT", "status": "success", "message": "Required files found"},
    {"name": "SANDBOX", "status": "success", "message": "Sandbox environment ready"},
    {"name": "GRADE", "status": "success", "message": "Grading completed: 85.5/100"}
  ]
}

Example Summary (Sandbox Failure)

{
  "status": "failed",
  "failed_at_step": "SANDBOX",
  "total_steps_planned": 8,
  "steps_completed": 4,
  "execution_time_ms": 1523,
  "steps": [
    {"name": "LOAD_TEMPLATE", "status": "success"},
    {"name": "BUILD_TREE", "status": "success"},
    {"name": "PRE_FLIGHT", "status": "success"},
    {
      "name": "SANDBOX",
      "status": "fail",
      "message": "Setup command 'Compile Calculator.java' failed",
      "error_details": {
        "error_type": "setup_command_failed",
        "phase": "setup_commands",
        "command_name": "Compile Calculator.java",
        "failed_command": {
          "command": "javac Calculator.java",
          "exit_code": 1,
          "stderr": "Calculator.java:4: error: ';' expected..."
        }
      }
    }
  ]
}

Test Result

Individual test execution result.

Properties

class TestResult:
    test_name: str                            # Name of the test function
    score: float                              # Score achieved (0-100)
    report: str                               # Human-readable result message
    subject_name: str = ""                    # Parent subject name
    parameters: Optional[Dict[str, Any]]      # Test parameters used

Example

{
  "test_name": "expect_output",
  "score": 100,
  "report": "Output matched expected value 'Hello, Alice!'",
  "subject_name": "Functionality",
  "parameters": {
    "inputs": ["Alice"],
    "expected_output": "Hello, Alice!",
    "program_command": "python3 main.py"
  }
}

Submission

Student submission data.

Properties

class Submission:
    username: str                                   # Student identifier
    user_id: int                                    # Student ID
    assignment_id: int                              # Assignment identifier
    submission_files: Dict[str, SubmissionFile]      # Uploaded files keyed by filename
    language: Optional[Language] = None             # Programming language

Submission File

class SubmissionFile:
    filename: str      # Name of file
    content: str       # File contents (text)

Template

Template provides test functions for a specific assignment type.

Properties

class Template:
    name: str                           # Template identifier
    tests: dict[str, TestFunction]      # Available test functions
    requires_sandbox: bool              # Whether code execution needed
    supported_languages: list[str]      # Compatible languages

Test Function

class TestFunction:
    name: str                          # Function identifier

    def execute(
        self,
        files: dict[str, str],
        sandbox: SandboxContainer | None,
        **kwargs
    ) -> TestResult:
        """Execute the test and return result"""
        pass
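A concrete implementation of this interface might look like the following sketch: a hypothetical has_tag check for HTML submissions. The TestResult fields follow the definition above; the class itself is illustrative, not taken from the template library:

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

@dataclass
class TestResult:
    test_name: str
    score: float
    report: str
    subject_name: str = ""
    parameters: Optional[Dict[str, Any]] = None

class HasTag:
    """Hypothetical TestFunction: checks that any file contains <tag>."""
    name = "has_tag"

    def execute(self, files, sandbox=None, **kwargs):
        tag = kwargs["tag"]
        found = any(f"<{tag}" in content for content in files.values())
        return TestResult(
            test_name=self.name,
            score=100.0 if found else 0.0,
            report=f"<{tag}> {'found' if found else 'not found'}",
            parameters={"tag": tag},
        )

result = HasTag().execute({"index.html": "<header>Hi</header>"}, tag="header")
# result.score == 100.0
```

Tests that need code execution (e.g. expect_output) would additionally use the sandbox argument; static checks like this one can ignore it.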