A Record of a Mortal's Journey to Immortality: Causal Graph System

Automatically adapt novels into scripts, leveraging advanced AI technology to achieve low-cost, high-efficiency content promotion.
Overall Workflow Overview
Original Text Preparation (STEP A0)
Clean and standardize the original novel TXT files (e.g., "A Record of a Mortal's Journey to Immortality") and split them by chapter into JSON format.
Text Standardization and Segmentation (STEP A1)
Segment sentences by blank lines/punctuation, maintaining logical integrity, and output structured text roughly segmented by paragraph/event.
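A minimal sketch of this segmentation step, assuming paragraphs are separated by blank lines and sentences end with common Chinese or Western terminal punctuation. The function name segment and the punctuation set are illustrative assumptions, not the project's actual code. The split is deliberately rough: a decimal such as "3.5" would be split at the period.

```python
import re
from typing import List

# Assumed set of sentence-ending punctuation (Chinese and Western).
_SENTENCE = re.compile(r"[^。！？!?.]+[。！？!?.]*")


def segment(text: str) -> List[List[str]]:
    """Split text into paragraphs (by blank lines), then into sentences."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    result = []
    for paragraph in paragraphs:
        # Each match is a run of non-terminal characters plus its punctuation.
        sentences = [s.strip() for s in _SENTENCE.findall(paragraph)]
        result.append([s for s in sentences if s])
    return result
```

Each inner list is one paragraph, which can serve as a rough paragraph/event unit for the extraction step.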
LLM Information Extraction (STEP B1)
Extract key events, treasures, characters, outcomes, and context from chapter-segmented text, outputting structured event entries.
Hallucination Detection and Refinement (STEP B2)
Correct hallucinations in LLM extraction results to ensure information accuracy.
Causal Relationship Identification (STEP C1)
Identify causal relationships between cleaned events and label their strength.
DAG Construction (STEP C2)
Construct a Directed Acyclic Graph (DAG) structure based on the list of causal pairs.
Mermaid Graph Output (STEP D1)
Convert the DAG graph structure into visualized Mermaid graph code.
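The Mermaid output step can be sketched as follows. This is an illustration under stated assumptions: the function name to_mermaid, the node dictionary, and the (cause, effect, strength) edge tuples are hypothetical formats, not the project's actual data model. It emits standard Mermaid flowchart syntax with the strength as the edge label.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical edge format: (cause_id, effect_id, strength)
Edge = Tuple[str, str, str]


def to_mermaid(nodes: Dict[str, str], edges: Iterable[Edge]) -> str:
    """Render a DAG as Mermaid flowchart source with a top-down layout."""
    lines = ["graph TD"]
    for node_id, label in nodes.items():
        # Quote labels so punctuation in event text does not break parsing.
        safe = label.replace('"', "'")
        lines.append(f'    {node_id}["{safe}"]')
    for cause, effect, strength in edges:
        # The strength level becomes the label on the arrow.
        lines.append(f"    {cause} -->|{strength}| {effect}")
    return "\n".join(lines)


if __name__ == "__main__":
    demo_nodes = {"E1": "Event A", "E2": "Event B"}
    print(to_mermaid(demo_nodes, [("E1", "E2", "High")]))
```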
Theoretical Support: Reader-Rewriter (R2) Framework
The R2 framework aims to automatically adapt novels into scripts, promoting content at a low cost. It addresses two major challenges: inconsistencies in plot extraction and script generation due to LLM hallucinations, and how to effectively extract causally embedded plot lines for coherent rewriting.
We propose two strategies: the Hallucination-Aware Refinement (HAR) method for iteratively discovering and eliminating hallucinations; and the Causal Plot Graph Construction (CPC) method for effectively building plot lines with event causal relationships.
R2 Framework Core Modules
Reader Module
Utilizes sliding windows and CPC to construct a causal plot graph for extracting accurate event and character profiles.
Character and Event Extraction: Identifies plot events chapter by chapter, handling long texts.
Plot Graph Extraction: Constructs a causal plot graph using CPC, eliminating cycles and low-weight edges.
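The sliding-window idea used by the Reader can be sketched as below. The window size and stride are assumed parameters chosen for illustration; the source does not state the values used. Overlapping windows let each extraction call see some of the preceding chapters as context.

```python
from typing import Iterator, List


def sliding_windows(chapters: List[str], size: int = 3, stride: int = 2) -> Iterator[List[str]]:
    """Yield overlapping groups of chapters so long texts fit in an LLM context."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if not chapters:
        return
    start = 0
    while True:
        yield chapters[start:start + size]
        # Stop once the current window has reached the last chapter.
        if start + size >= len(chapters):
            break
        start += stride
```

With five chapters, a size of 3, and a stride of 2, the windows are chapters 1-3 and chapters 3-5, so chapter 3 is shared between them.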
Rewriter Module
Generates scene outlines based on the plot graph, then generates the script, and integrates HAR for accurate reasoning.
Outline Generation: Creates a script adaptation outline, including core story elements, structure, and writing plan.
Script Generation: Generates each scene according to the scene writing plan, with HAR verifying consistency.
Core Technologies & Algorithms
LLM Information Extraction
Utilize GPT-4o / Claude 3 models for efficient information extraction.
Hallucination-Aware Refinement (HAR)
Correct hallucinations in LLM output through pre-refinement prompt rewriting.
Causal Relationship Identification
An LLM judges the causal relationships between events. The results are optimized with a "two paths, merge and converge" strategy and adjusted inversely by entity frequency weights.
DAG Construction
Build Directed Acyclic Graphs using a greedy cycle-breaking algorithm.
Hallucination-Aware Refinement (HAR) Mechanism
HAR prompts LLMs to identify internal inconsistencies caused by hallucinations, locate where they occur, and provide refinement suggestions. It goes through an iterative self-refinement process until the initial input data is fully processed and consistent.
Locate Hallucination & Suggest Refinement
The LLM identifies internal inconsistencies in the input and provides refinement suggestions.
Retrieve Context
Based on the hallucination's location, hallucination-aware context is extracted from supporting text.
Obtain Refined Portion
The LLM uses prompts, context, and suggestions to generate the corrected portion.
Update & Iterate
The corrected portion is integrated back into the original input, and the process iterates until consistency is achieved.
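The four steps above can be sketched as a loop. This is a minimal illustration, not the framework's implementation: the three callables stand in for LLM and retrieval calls, the Issue record and its character offsets are hypothetical, and max_rounds is an assumed safety cap that the source does not mention.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Issue:
    start: int       # offset where the inconsistent span begins
    end: int         # offset one past the end of the span
    suggestion: str  # refinement suggestion produced by the LLM


def refine(
    draft: str,
    support: str,
    locate: Callable[[str], Optional[Issue]],
    retrieve: Callable[[str, Issue], str],
    rewrite: Callable[[str, str, str], str],
    max_rounds: int = 5,
) -> str:
    """Iteratively locate, contextualize, rewrite, and merge until consistent."""
    for _ in range(max_rounds):
        issue = locate(draft)              # 1. locate hallucination and suggest
        if issue is None:
            break                          # no inconsistency left
        context = retrieve(support, issue)  # 2. retrieve supporting context
        fixed = rewrite(draft[issue.start:issue.end], context, issue.suggestion)  # 3. refine
        draft = draft[:issue.start] + fixed + draft[issue.end:]  # 4. update and iterate
    return draft
```

Passing the three steps in as callables keeps the loop testable with simple stubs before any real model is wired in.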
Causal Plot Graph Construction (CPC)
The Causal Plot Graph embeds event causal relationships through a graph, representing plot events (nodes) and their causal relationships (edges) as a directed acyclic graph (DAG). Relationship strengths are categorized into high, medium, and low.
Strength levels:
1. High: Direct and significant influence
2. Medium: Partial or indirect influence
3. Low: Minimal or weak influence
We propose a greedy cycle-breaking algorithm that removes cycles and low-strength relationships contained within the plot graphs extracted by LLMs, based on relationship strength and event node degrees. This ensures the graph is acyclic and retains the most important causal relationships.
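One way to realize a greedy cycle-breaking pass is sketched below. The ordering rule (strongest edges first, ties broken by the smaller sum of endpoint degrees) and the (cause, effect, strength) tuple format are assumptions for illustration; the source only states that strength and node degrees are used. The sketch drops only edges that would close a cycle, which tend to be the weaker ones because strong edges are inserted first.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

STRENGTH_RANK = {"High": 3, "Medium": 2, "Low": 1}
Edge = Tuple[str, str, str]  # hypothetical format: (cause, effect, strength)


def _reaches(adj: Dict[str, Set[str]], src: str, dst: str) -> bool:
    """Depth-first search: is dst reachable from src along kept edges?"""
    stack, seen = [src], {src}
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def break_cycles(edges: List[Edge]) -> List[Edge]:
    """Greedily keep edges in priority order, skipping any that close a cycle."""
    degree: Dict[str, int] = defaultdict(int)
    for cause, effect, _ in edges:
        degree[cause] += 1
        degree[effect] += 1
    # Assumed priority: higher strength first, then lower endpoint degree sum.
    ordered = sorted(
        edges,
        key=lambda e: (-STRENGTH_RANK[e[2]], degree[e[0]] + degree[e[1]]),
    )
    adj: Dict[str, Set[str]] = defaultdict(set)
    kept: List[Edge] = []
    for cause, effect, strength in ordered:
        # Adding cause -> effect closes a cycle if effect already reaches cause.
        if _reaches(adj, effect, cause):
            continue
        adj[cause].add(effect)
        kept.append((cause, effect, strength))
    return kept
```

For a three-event loop A to B (High), B to C (Medium), C to A (Low), the Low edge is the one discarded, leaving a chain from A to C.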
Experimental Results and Ablation Study
R2 significantly outperforms ROLLING, Dramatron, and Wawa Writer in both GPT-4o and human evaluations, especially excelling in wording & grammar, interestingness, transitions, and consistency.
GPT-4o Evaluation Results
Ablation studies show that removing HAR leads to a significant decrease in wording & grammar and consistency, indicating HAR's critical role in language quality and coherence. Removing CPC results in a significant drop in interestingness and consistency, indicating CPC's importance in generating engaging and consistent scripts. Excluding all context support drastically reduces transitions and consistency.
These results validate the crucial roles of HAR and CPC within the R2 framework.
Code Structure Design
The "A Record of a Mortal's Journey to Immortality: Causal Graph System" adopts a microservice architecture in which each functional domain runs as an independent service. Internally, each service follows a "module division + layered design" approach.
Top-level Directory Structure
The project root directory includes api_gateway, causal_linking, common, graph_builder, hallucination_refine, log_doc, novel, reports, scripts, tests, text_ingestion, logs, output, main.py, requirements.txt, and README.md.
Microservice Structure Specification
Each microservice (e.g., causal_linking) includes app.py, controller/, service/, repository/, domain/, and di/, achieving clear separation of responsibilities.
Core Service Design Checklist
The core services are text_ingestion (loading TXT novels), event_extraction (extracting events), hallucination_refine (refining LLM output), causal_linking (building causal pairs), graph_builder (building Mermaid graph code), and api_gateway (integrating services).
Abstract Interfaces and Implementation
Interfaces are defined through abstract base classes, such as AbstractLinker. Concrete implementations share this unified interface, which supports dynamic switching of linking strategies.
Dependency Injection Upgrade
The dependency injection layer selects a linker instance based on configuration, so linking strategies can be swapped flexibly.
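The abstract interface and configuration-driven selection might look like the sketch below. Only the name AbstractLinker comes from the design above. The link method, the two toy strategies, the registry, the provide_linker function, and the "linker" configuration key are all hypothetical names introduced for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple, Type

CausalPair = Tuple[str, str, str]  # assumed format: (cause, effect, strength)


class AbstractLinker(ABC):
    """Interface that every linking strategy implements."""

    @abstractmethod
    def link(self, events: List[str]) -> List[CausalPair]:
        ...


class SequentialLinker(AbstractLinker):
    """Toy strategy: link each event to the next one with Low strength."""

    def link(self, events: List[str]) -> List[CausalPair]:
        return [(a, b, "Low") for a, b in zip(events, events[1:])]


class NullLinker(AbstractLinker):
    """Toy strategy: produce no causal pairs."""

    def link(self, events: List[str]) -> List[CausalPair]:
        return []


# Hypothetical registry mapping configuration values to strategies.
_REGISTRY: Dict[str, Type[AbstractLinker]] = {
    "sequential": SequentialLinker,
    "null": NullLinker,
}


def provide_linker(config: Dict[str, str]) -> AbstractLinker:
    """Return the linker named in the configuration (default: sequential)."""
    name = config.get("linker", "sequential")
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown linker strategy: {name!r}") from None
```

Callers depend only on AbstractLinker, so a new strategy needs just one more registry entry.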
Testing Framework and Integration Testing
Unified testing scripts and phased testing strategies (stage_1, stage_2, stage_3) ensure system reliability.