[
  {
    "id": "01",
    "category": "Executive Summary & Strategy",
    "prompt_title": "Prompt 1: Generate the Executive Summary",
    "prompt_file": "fdt_prompt_library/executive_summary_and_strategy/01_generate_executive_summary.md",
    "prompt_content": "### **Prompt 1: Generate the Executive Summary**\n\n\"Draft a concise executive summary for a strategic blueprint on implementing a **Financial Digital Twin (FDT)** for lending operations. The summary must cover:\n1.  **The Problem:** The challenges of the modern financial landscape (volatility, competition) and the limitations of traditional, siloed data systems in lending.\n2.  **The Solution:** The vision of the FDT as a living, virtual replica of the lending ecosystem, moving the institution from reactive reporting to predictive foresight.\n3.  **Core Capabilities:** Mention real-time simulation, predictive risk analysis, automated compliance, and hyper-personalized products.\n4.  **The Architecture:** Briefly describe the hybrid architecture centered on a knowledge graph and powered by an agentic framework.\n5.  **The Roadmap & ROI:** Reference a phased, three-year implementation plan and state the projected business outcomes, such as a 10-15% reduction in credit losses, 80% automation of regulatory reporting, and a 5% increase in loan origination.\"\n",
    "artifact_title": "Executive Summary: Strategic Blueprint for the Financial Digital Twin (FDT)",
    "artifact_file": "fdt_artifacts/executive_summary_and_strategy/01_executive_summary.md",
    "artifact_content": "# Executive Summary: Strategic Blueprint for the Financial Digital Twin (FDT)\n\n## The Problem: Navigating the New Financial Landscape\n\nThe modern financial landscape is defined by unprecedented volatility, fierce competition from agile FinTechs, and increasing regulatory scrutiny. Traditional lending operations, reliant on siloed data systems (LOS, Core, Servicing), are struggling to keep pace. This fragmented approach results in a reactive, backward-looking posture, creating significant challenges in risk management, operational efficiency, and customer satisfaction. Key data remains trapped in disparate systems, hindering the ability to gain a holistic view of risk and opportunity, leaving the institution vulnerable to unforeseen market shifts and sophisticated fraud.\n\n## The Solution: A Shift from Hindsight to Foresight with the FDT\n\nThis blueprint outlines the vision and implementation of a Financial Digital Twin (FDT)\u2014a living, virtual replica of the institution's entire lending ecosystem. The FDT will fuse disparate data sources into a dynamic, computable model of the business, enabling a paradigm shift from reactive reporting to predictive and prescriptive foresight. 
By simulating the real world in a virtual environment, the FDT will empower the institution to anticipate risks, stress-test portfolios, and optimize decisions with a high degree of confidence.\n\n## Core Capabilities\n\nThe FDT will deliver a suite of transformative capabilities:\n\n*   **Real-Time Simulation:** Model the impact of market shocks, interest rate changes, or credit events on the entire portfolio in real-time.\n*   **Predictive Risk Analysis:** Leverage AI and graph analytics to uncover hidden counterparty risks, predict defaults, and identify emerging fraud patterns.\n*   **Automated Compliance:** Continuously monitor transactions and relationships against regulatory watchlists and internal policies, automating the generation of compliance reports.\n*   **Hyper-Personalized Products:** Simulate customer behavior and market response to design and offer tailored lending products that optimize both customer value and institutional profitability.\n\n## The Architecture: A Knowledge-Centric, Agent-Driven Platform\n\nThe FDT is built on a modern, hybrid data architecture centered on an enterprise **Knowledge Graph**. This semantic core connects all entities\u2014customers, loans, collateral, guarantors, and market events\u2014into a rich, contextualized network. This eliminates data silos and enables complex, multi-hop reasoning that is impossible with traditional databases. An **Agentic Framework**, composed of specialized AI agents, will work collaboratively on this knowledge graph to automate monitoring, analysis, and reporting tasks.\n\n## The Roadmap & Projected ROI\n\nThe FDT will be implemented via a phased, three-year roadmap, beginning with foundational data integration and culminating in advanced causal simulation capabilities. 
The initiative is projected to deliver significant business outcomes, including:\n\n*   **10-15% reduction in credit losses** through early risk detection and predictive modeling.\n*   **80% automation of regulatory reporting**, reducing operational costs and compliance risks.\n*   **5% increase in loan origination** by identifying new market opportunities and accelerating product development.\n\nThis strategic investment will not only fortify the institution's risk posture but also serve as a platform for sustained innovation and competitive advantage in the digital age.\n"
  },
  {
    "id": "02",
    "category": "Executive Summary & Strategy",
    "prompt_title": "Prompt 2: Explain the Strategic Imperative",
    "prompt_file": "fdt_prompt_library/executive_summary_and_strategy/02_explain_strategic_imperative.md",
    "prompt_content": "### **Prompt 2: Explain the Strategic Imperative**\n\n\"Explain the strategic imperative for a financial institution to transition from a traditional lending operation to an **'Intelligent Lending Ecosystem.'** Your explanation should:\n1.  Describe the evolving risk landscape, including market volatility, geopolitical risks, and sophisticated fraud.\n2.  Highlight the competitive pressures from agile FinTech companies.\n3.  Critique the traditional, siloed approach to data management (LOS, servicing, risk systems) and explain how it leads to a reactive, backward-looking risk posture.\"\n",
    "artifact_title": "The Strategic Imperative for an Intelligent Lending Ecosystem",
    "artifact_file": "fdt_artifacts/executive_summary_and_strategy/02_strategic_imperative.md",
    "artifact_content": "# The Strategic Imperative for an Intelligent Lending Ecosystem\n\nThe transition from a traditional, siloed lending operation to an **Intelligent Lending Ecosystem** is no longer a strategic choice but a critical imperative for survival and growth in the modern financial industry. This shift is driven by a confluence of powerful forces that render legacy approaches obsolete and dangerous.\n\n## 1. The Evolving Risk Landscape\n\nThe nature of risk has fundamentally changed. Financial institutions now face a multi-faceted and dynamic threat environment that traditional systems are ill-equipped to handle.\n\n*   **Market Volatility:** Global markets are increasingly interconnected and susceptible to rapid, unforeseen shocks, from pandemics to geopolitical conflicts. A static, quarterly review of risk is insufficient to protect the portfolio from high-velocity events.\n*   **Geopolitical & Climate Risks:** Events in one part of the world can have cascading impacts on supply chains, commodity prices, and specific industries, creating complex, correlated risks across the lending book that are difficult to track manually.\n*   **Sophisticated Fraud:** Fraudsters are no longer lone actors but organized networks leveraging synthetic identities and coordinated schemes to attack institutions at scale. Detecting these patterns requires analyzing relationships and behaviors across the entire portfolio, not just individual accounts.\n\n## 2. Competitive Pressures from FinTech\n\nAgile, data-native FinTech companies and neobanks are relentlessly eroding the market share of traditional institutions. 
Their competitive advantages are built on a foundation of modern data architecture and AI:\n\n*   **Speed and Agility:** They can approve loans in minutes, not weeks, by leveraging real-time data and automated decisioning.\n*   **Hyper-Personalization:** They use data to create tailored products and experiences, increasing customer acquisition and loyalty.\n*   **Operational Efficiency:** Their lean, automated operations allow them to operate at a lower cost base, offering more competitive rates.\n\nWithout a commensurate investment in an intelligent data foundation, incumbent institutions will be unable to compete on speed, price, or customer experience.\n\n## 3. The Failure of the Siloed, Reactive Posture\n\nThe traditional approach to data management is the primary obstacle to navigating this new reality. Lending operations are typically fragmented across multiple, disconnected systems:\n\n*   **Loan Origination System (LOS):** Contains underwriting and application data.\n*   **Servicing Platform:** Manages ongoing payments and loan performance.\n*   **Risk Systems:** House credit scores and risk models.\n*   **Core Banking System:** Holds customer deposit and relationship data.\n\nThis siloing of data creates a **reactive, backward-looking risk posture**. Analysts spend their time manually stitching together data from different sources to produce historical reports. By the time a risk is identified, it has already materialized. There is no capacity for real-time monitoring, predictive analysis, or portfolio-wide simulation. 
This fragmentation makes it impossible to answer critical questions quickly, such as \"What is our total exposure to a specific industry that is being impacted by a new tariff?\" or \"Which of our borrowers are connected to this newly sanctioned entity?\"\n\nThe strategic imperative is clear: to survive and thrive, financial institutions must break down these silos and build a unified, intelligent ecosystem that enables them to see, understand, and act on risk and opportunity at the speed of the modern market.\n"
  },
  {
    "id": "03",
    "category": "Executive Summary & Strategy",
    "prompt_title": "Prompt 3: Compare Legacy Data Architectures",
    "prompt_file": "fdt_prompt_library/executive_summary_and_strategy/03_compare_legacy_data_architectures.md",
    "prompt_content": "### **Prompt 3: Compare Legacy Data Architectures**\n\n\"Compare and contrast the **Data Warehouse** and the **Data Lake** in the context of modern financial lending operations. For each architecture, explain:\n1.  Its core paradigm (e.g., 'schema-on-write' vs. 'schema-on-read').\n2.  Its primary strengths and weaknesses for handling diverse financial data (structured, unstructured).\n3.  Why both are ultimately inadequate for the FDT's vision of a real-time, predictive, and simulation-ready platform.\"\n",
    "artifact_title": "Legacy Data Architectures: A Comparative Analysis for Lending Operations",
    "artifact_file": "fdt_artifacts/executive_summary_and_strategy/03_legacy_data_architectures.md",
    "artifact_content": "# Legacy Data Architectures: A Comparative Analysis for Lending Operations\n\nTo understand the architectural necessity of the Financial Digital Twin (FDT), it is crucial to first analyze the two dominant legacy data paradigms: the Data Warehouse and the Data Lake. While both have served valuable purposes, their inherent limitations render them inadequate for the demands of a real-time, predictive lending ecosystem.\n\n## The Data Warehouse\n\nThe Data Warehouse has been the cornerstone of business intelligence for decades, providing a reliable source for historical reporting and analysis.\n\n### 1. Core Paradigm: Schema-on-Write\n\nThe Data Warehouse operates on a **\"schema-on-write\"** model. Data from various operational systems (like LOS, servicing platforms) undergoes a rigorous Extract, Transform, Load (ETL) process. During this process, the data is cleaned, conformed, and loaded into a predefined, highly structured relational schema (typically a star or snowflake schema). The structure is defined *before* data is written to the warehouse.\n\n### 2. Strengths and Weaknesses\n\n*   **Strengths:**\n    *   **High Data Quality & Consistency:** The rigid ETL process ensures that data is standardized and reliable.\n    *   **Optimized for Reporting:** The structured schema is highly optimized for fast, efficient querying for known business questions and generating standard reports.\n    *   **Business-Friendly:** Data is modeled in familiar business terms (e.g., customers, loans, payments), making it accessible to business analysts.\n\n*   **Weaknesses:**\n    *   **Inflexible:** The predefined schema is brittle and expensive to change. 
Adding new data sources or changing business requirements can take months.\n    *   **Poor Handling of Unstructured Data:** Relational models are not designed to handle the growing volume of unstructured and semi-structured data crucial for modern risk analysis (e.g., legal documents, news articles, social media data, call center notes).\n    *   **Batch-Oriented:** The ETL process typically runs in batches (daily or weekly), meaning the data is never real-time. It reflects the state of the business as of *yesterday*.\n\n## The Data Lake\n\nThe Data Lake emerged as a response to the inflexibility and data type limitations of the Data Warehouse, driven by big data technologies.\n\n### 1. Core Paradigm: Schema-on-Read\n\nThe Data Lake follows a **\"schema-on-read\"** philosophy. It is a vast storage repository (like HDFS or cloud object storage) that ingests raw data from source systems in its native format\u2014structured, semi-structured, or unstructured. The data is loaded first (Extract, Load, Transform or ELT), and a structure or schema is applied only when the data is read for a specific analytical purpose.\n\n### 2. Strengths and Weaknesses\n\n*   **Strengths:**\n    *   **Extreme Flexibility:** It can store any type of data without needing a predefined schema, making it easy to ingest new data sources.\n    *   **Cost-Effective Storage:** Utilizes low-cost commodity hardware or cloud storage.\n    *   **Ideal for Data Science:** Data scientists can access raw, untransformed data to explore, experiment, and build predictive models.\n\n*   **Weaknesses:**\n    *   **Risk of Becoming a \"Data Swamp\":** Without strong governance, the lack of structure can lead to a chaotic and unusable repository of data.\n    *   **Complex to Query:** Querying requires specialized skills and tools. 
Business users cannot easily access or make sense of the raw data.\n    *   **Inconsistent Performance:** Query performance can be slow and unpredictable compared to a highly optimized data warehouse.\n\n## 3. Why Both Are Inadequate for the FDT\n\nWhile a modern architecture may leverage elements of both (in the form of a \"Lakehouse\"), neither paradigm alone can fulfill the vision of the Financial Digital Twin.\n\n*   **Lack of Real-Time Capability:** The Data Warehouse is inherently batch-oriented. The Data Lake *can* support streaming ingestion, but it is not optimized for the real-time, low-latency query performance needed for operational simulation and immediate risk alerts.\n*   **Inability to Model Relationships:** Both architectures struggle to model and query the complex, many-to-many relationships at the heart of lending (e.g., a single person can be a borrower on one loan, a guarantor on another, and a director of a company that is also a borrower). Answering relationship-based questions requires complex, inefficient, and slow `JOIN` operations in a relational model or complex code in a data lake.\n*   **Not Simulation-Ready:** The FDT's core purpose is to be a *computable, dynamic model* of the business. It needs to not only store data but also represent the connections, dependencies, and behaviors of the system. Neither a static relational schema nor a collection of raw files in a lake provides the semantic richness or graph-native query capabilities required to run \"what-if\" scenarios or perform multi-hop risk analysis efficiently.\n\nThe FDT requires a new core\u2014one built not just for storing data, but for understanding relationships and simulating outcomes. This is why a Knowledge Graph is the necessary evolution beyond these legacy architectures.\n"
  },
  {
    "id": "04",
    "category": "Executive Summary & Strategy",
    "prompt_title": "Prompt 4: Define the FDT Vision and Business Alignment",
    "prompt_file": "fdt_prompt_library/executive_summary_and_strategy/04_define_fdt_vision.md",
    "prompt_content": "### **Prompt 4: Define the FDT Vision and Business Alignment**\n\n\"Articulate the vision for the **Financial Digital Twin (FDT)**, emphasizing the paradigm shift from **'Hindsight to Foresight.'**\n1.  Define the FDT as a living, dynamic, computable model of the entire lending portfolio.\n2.  Detail its three core value propositions: **Holistic Situational Awareness**, **Predictive and Prescriptive Analytics**, and **Intelligent Automation**.\n3.  Align these capabilities with four core business objectives: Enhanced Risk Management, Improved Operational Efficiency, Accelerated Revenue Growth, and Robust Regulatory Compliance. Provide specific, measurable targets for each objective.\"\n",
    "artifact_title": "The Vision for the Financial Digital Twin: From Hindsight to Foresight",
    "artifact_file": "fdt_artifacts/executive_summary_and_strategy/04_fdt_vision_and_business_alignment.md",
    "artifact_content": "# The Vision for the Financial Digital Twin: From Hindsight to Foresight\n\nThe vision for the Financial Digital Twin (FDT) represents a fundamental paradigm shift in how the financial institution perceives and manages its operations. We are moving away from the limitations of historical, batch-oriented reporting (\"Hindsight\") and toward a future of dynamic, real-time, predictive analysis (\"Foresight\").\n\n## 1. Defining the Financial Digital Twin\n\nThe FDT is a **living, dynamic, and computable model of the entire lending portfolio and its operating environment.** It is not merely a dashboard or a database; it is a virtual replica of the real world that continuously ingests data from internal systems (LOS, Servicing, Core) and external sources (market data, news feeds, regulatory lists). This model understands the entities within our ecosystem (borrowers, loans, collateral) and, crucially, the complex web of relationships that connect them. By representing the business as a computable model, we can simulate, query, and analyze it in ways that are impossible with traditional data architectures.\n\n## 2. Core Value Propositions\n\nThe FDT delivers value across three core pillars, transforming data from a passive, historical record into an active, strategic asset.\n\n*   **Holistic Situational Awareness:** By breaking down data silos and fusing all relevant information into a single, unified view, the FDT provides a complete, 360-degree understanding of risk and opportunity. Analysts can instantly see a borrower's total exposure, including their direct loans, guaranteed loans, and hidden connections to other entities in the portfolio.\n*   **Predictive and Prescriptive Analytics:** The FDT moves beyond asking \"What happened?\" to answer \"What is likely to happen, and what should we do about it?\". 
Using AI/ML and graph neural networks, the FDT will predict credit defaults, identify sophisticated fraud rings, and forecast the impact of economic scenarios on the portfolio. It will prescribe actions, such as proactive outreach to at-risk customers or adjustments to lending criteria.\n*   **Intelligent Automation:** A framework of collaborative AI agents will work on top of the FDT to automate high-volume, repetitive tasks. This includes continuous compliance monitoring, covenant checking, and the generation of regulatory reports, freeing up human experts to focus on high-value strategic activities.\n\n## 3. Alignment with Core Business Objectives\n\nThe capabilities of the FDT are directly aligned with the institution's most critical business objectives. We will measure success against the following specific and measurable targets:\n\n| Business Objective                | FDT Alignment & Key Capabilities                                                                                                                                                                                            | Measurable Targets (Year 3)                                                                                                                                                                                                                               |\n| --------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| **Enhanced Risk Management**      | The FDT's predictive analytics and holistic view of counterparty risk will allow for 
the early identification and mitigation of credit, market, and operational risks. Causal simulation will enable robust portfolio stress testing. | \u2022 **Reduce credit losses by 10-15%.**<br>\u2022 **Reduce time to detect critical counterparty risk from weeks to minutes.**<br>\u2022 **Increase portfolio stress testing scenarios by 10x.**                                                                           |\n| **Improved Operational Efficiency** | Intelligent automation of compliance, reporting, and monitoring tasks will significantly reduce manual effort, operational costs, and the risk of human error.                                                              | \u2022 **Automate 80% of regulatory reporting (e.g., BCBS 239, SAR).**<br>\u2022 **Reduce manual data reconciliation effort by 90%.**<br>\u2022 **Decrease average loan underwriting time by 25%.**                                                                      |\n| **Accelerated Revenue Growth**    | Hyper-personalized product design and a deeper understanding of market dynamics will enable the institution to identify new customer segments, accelerate product development, and improve cross-selling opportunities.         | \u2022 **Increase loan origination by 5%.**<br>\u2022 **Reduce time-to-market for new lending products by 50%.**<br>\u2022 **Increase customer lifetime value by 7% through improved cross-sell effectiveness.**                                                             |\n| **Robust Regulatory Compliance**  | The FDT provides a transparent, auditable, and real-time view of the entire portfolio, ensuring adherence to regulations like BCBS 239. Automated monitoring provides a continuous and proactive compliance posture.         
| \u2022 **Achieve 100% automated lineage for critical data elements.**<br>\u2022 **Eliminate fines related to sanctions screening errors.**<br>\u2022 **Reduce audit preparation time by 60% by providing auditors with direct, transparent access to required data.** |\n"
  },
  {
    "id": "05",
    "category": "Semantic Foundation & Data Modeling",
    "prompt_title": "Prompt 5: Explain the Knowledge Graph Core",
    "prompt_file": "fdt_prompt_library/semantic_foundation_and_data_modeling/05_explain_knowledge_graph_core.md",
    "prompt_content": "### **Prompt 5: Explain the Knowledge Graph Core**\n\n\"Explain why a **Knowledge Graph** is the ideal semantic core for a Financial Digital Twin, as opposed to a traditional relational database. Your explanation should cover:\n1.  How knowledge graphs model data as a network of entities (nodes) and relationships (edges).\n2.  The inefficiency of using complex `JOIN` operations in relational databases to model the interconnected nature of lending (borrower, loan, collateral, guarantor).\n3.  The knowledge graph's native ability to perform rapid, multi-hop reasoning to uncover hidden risks and complex connections.\"\n",
    "artifact_title": "The Knowledge Graph: The Semantic Core of the Financial Digital Twin",
    "artifact_file": "fdt_artifacts/semantic_foundation_and_data_modeling/05_knowledge_graph_core.md",
    "artifact_content": "# The Knowledge Graph: The Semantic Core of the Financial Digital Twin\n\nThe choice of a data architecture is the most critical decision in building a Financial Digital Twin (FDT). A traditional relational database, while familiar, is fundamentally unsuited for the core task of the FDT: understanding a complex, interconnected ecosystem in real-time. The ideal semantic core for the FDT is a **Knowledge Graph**.\n\n## 1. Modeling Data as a Network\n\nUnlike a relational database, which stores data in rigid rows and columns, a knowledge graph models the world as a network of entities and their relationships.\n\n*   **Nodes (Entities):** Nodes represent the key entities in the lending domain. These are the \"nouns\" of our business. Examples include: `Customer`, `Loan`, `Collateral`, `Company`, `Guarantor`, `Address`. Each node can have properties, such as a `Customer` node having a `name` and `creditScore`.\n*   **Edges (Relationships):** Edges are the verbs that connect the nouns. They represent the rich, contextual relationships between entities. Examples include: a `Customer` `HAS_LOAN` a `Loan`; a `Company` `EMPLOYS` a `Customer`; a `Guarantor` `GUARANTEES` a `Loan`; a `Loan` is `SECURED_BY` `Collateral`.\n\nThis network model is a more intuitive and powerful way to represent a lending portfolio, which is itself a complex network of financial and legal relationships.\n\n## 2. The Inefficiency of Relational `JOIN`s\n\nTo understand the knowledge graph's advantage, consider a simple risk question: \"Show me all the loans that are secured by the same piece of collateral as a loan held by 'John Doe'.\"\n\nIn a **relational database**, answering this requires a series of complex `JOIN` operations across multiple tables:\n1.  Find 'John Doe' in the `Customer` table.\n2.  `JOIN` with the `Loan` table to find his loans.\n3.  `JOIN` with a `Loan_Collateral_Link` table to find the collateral IDs.\n4.  
`JOIN` *back* to the `Loan_Collateral_Link` table to find *other* loans with the same collateral IDs.\n5.  `JOIN` with the `Loan` table again to get details on those other loans.\n\nThis query is complex to write, difficult to maintain, and, most importantly, **computationally expensive and slow**. Each additional `JOIN` can multiply the size of the intermediate result set, so query performance tends to degrade sharply as the number of hops grows. Asking a more complex, multi-hop question (e.g., \"Find all loans whose guarantors are directors of companies that have defaulted on *their* loans\") can become practically impossible to answer in real time.\n\n## 3. The Power of Multi-Hop Reasoning\n\nIn a **knowledge graph**, relationships are not computed at query time with `JOIN`s; they are stored as first-class citizens. The same query becomes simple to express and, for traversal-heavy questions like this one, typically much faster. The query engine starts at the 'John Doe' node and simply \"walks\" the graph, traversing the pre-existing relationships.\n\nThis native ability to perform rapid, multi-hop reasoning is the superpower of the knowledge graph. It allows analysts to quickly uncover hidden risks and complex connections that are hard to surface in a relational model:\n\n*   **Counterparty Risk:** Discover that two seemingly unrelated borrowers are connected because they share a director, a guarantor, or a shell company address.\n*   **Fraud Rings:** Identify sophisticated fraud networks where multiple synthetic identities are linked together in non-obvious ways.\n*   **Contagion Analysis:** Simulate how the default of one company could cascade through the portfolio by impacting its suppliers, customers, and guarantors, where those parties are also our borrowers.\n\nThe knowledge graph provides the only viable foundation for the FDT, enabling the shift from asking simple questions about siloed data to exploring the entire interconnected system and asking deep, predictive questions about the future.\n"
  },
  {
    "id": "06",
    "category": "Semantic Foundation & Data Modeling",
    "prompt_title": "Prompt 6: Design a Proprietary Lending Ontology",
    "prompt_file": "fdt_prompt_library/semantic_foundation_and_data_modeling/06_design_proprietary_lending_ontology.md",
    "prompt_content": "### **Prompt 6: Design a Proprietary Lending Ontology**\n\n\"Design a proprietary ontology extension for lending operations that builds upon the **Financial Industry Business Ontology (FIBO)**.\n1.  State the purpose: to model concepts specific to our lending business not covered in the general FIBO standard.\n2.  Outline the methodical development process: identify core concepts, define properties, and link them to the FIBO hierarchy.\n3.  Provide a concrete code example in Turtle (`.ttl`) format that defines a `lending:LoanCovenant` class as a subclass of a relevant FIBO class and an object property `lending:violatesCovenant`.\"\n",
    "artifact_title": "Design for a Proprietary Lending Ontology Extension",
    "artifact_file": "fdt_artifacts/semantic_foundation_and_data_modeling/06_proprietary_lending_ontology.md",
    "artifact_content": "# Design for a Proprietary Lending Ontology Extension\n\nFor the Financial Digital Twin to operate with precision, it requires a \"common language\" or ontology that defines the concepts and relationships specific to our business. While industry standards like the **Financial Industry Business Ontology (FIBO)** provide an excellent foundation, they are by nature generic. To capture our unique business logic, covenants, and risk models, we must design a proprietary ontology extension.\n\n## 1. Purpose of the Proprietary Ontology\n\nThe purpose of the `lending:` ontology is to **formally model the specific entities, rules, and relationships that constitute our unique lending operations and risk management framework.** This extension will not replace FIBO but will build upon it, creating a richer, more specific semantic model that allows the FDT to:\n\n*   Represent custom loan covenants and conditions not found in FIBO.\n*   Model internal risk ratings and scoring methodologies.\n*   Define specific roles and relationships unique to our business processes (e.g., specific types of guarantors or collateral).\n*   Enable more precise and context-aware queries and AI agent behaviors.\n\n## 2. Methodical Development Process\n\nThe development of the `lending:` ontology will follow a structured, iterative process involving subject matter experts, data modelers, and technical architects.\n\n1.  **Identify Core Concepts:** Begin by interviewing business stakeholders (underwriters, risk analysts, compliance officers) to identify the core concepts, entities, and rules that are central to their daily operations but are not adequately covered by FIBO. This includes concepts like `LoanCovenant`, `InternalRiskScore`, `CollateralValuation`, etc.\n2.  **Define Classes and Properties:** For each identified concept, define it formally as a `Class` in the ontology. 
For each class, define its `data properties` (attributes with literal values, like a `covenantDescription` string) and its `object properties` (relationships to other classes, like `violatesCovenant`).\n3.  **Link to FIBO Hierarchy:** This is a critical step. Whenever possible, our new classes will be defined as `rdfs:subClassOf` a relevant FIBO class. This anchors our proprietary model to the industry standard, ensuring interoperability and a logical, hierarchical structure. For example, our `lending:LoanCovenant` might be a subclass of `fibo-fbc-fi-fi:FinancialInstrumentTerm`. This allows us to inherit properties and relationships from the parent class.\n4.  **Iterate and Refine:** The ontology is a living model. It will be version-controlled and continuously refined as new products are introduced, regulations change, and the needs of the business evolve.\n\n## 3. Concrete Example in Turtle (.ttl)\n\nBelow is a sample snippet of our proprietary ontology, written in the Turtle serialization format. 
This example defines a `LoanCovenant` class, an object property `violatesCovenant` to link a `Loan` to a `LoanCovenant` it has breached, and a specific instance of a covenant.\n\n```turtle\n@prefix : <http://ourbank.com/ontology/lending#> .\n@prefix fibo-fbc-fi-fi: <https://spec.edmcouncil.org/fibo/ontology/FBC/FinancialInstruments/FinancialInstruments/> .\n@prefix fibo-be-le-lp: <https://spec.edmcouncil.org/fibo/ontology/BE/LegalEntities/LegalPersons/> .\n@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n\n#################################################################\n#    Proprietary Ontology: http://ourbank.com/ontology/lending#\n#################################################################\n\n###  Class Definitions\n\n# Defines a specific rule or condition attached to a loan.\n# We make it a subclass of a general FIBO concept for financial terms.\n:LoanCovenant rdf:type rdfs:Class ;\n              rdfs:subClassOf fibo-fbc-fi-fi:FinancialInstrumentTerm ;\n              rdfs:label \"Loan Covenant\" ;\n              rdfs:comment \"A specific rule or condition contractually agreed upon as part of a loan, which if violated, may trigger specific actions such as a penalty or default.\" .\n\n###  Property Definitions\n\n# Defines the relationship between a Loan and a Covenant that it has violated.\n:violatesCovenant rdf:type rdf:Property ;\n                  rdfs:domain fibo-fbc-fi-fi:Loan ;\n                  rdfs:range :LoanCovenant ;\n                  rdfs:label \"violates covenant\" .\n\n# A data property to describe the covenant.\n:covenantDescription rdf:type rdf:Property ;\n                     rdfs:domain :LoanCovenant ;\n                     rdfs:range xsd:string ;\n                     rdfs:label \"covenant description\" .\n\n### Example Instance\n\n# An example of a specific covenant that can be attached to 
loans.\n:Covenant-DebtServiceCoverageRatio-1.5x\n    rdf:type :LoanCovenant ;\n    :covenantDescription \"Borrower must maintain a Debt Service Coverage Ratio (DSCR) of at least 1.5x, tested quarterly.\" .\n\n```\n\nThis structured approach ensures that our proprietary ontology is robust, maintainable, and firmly rooted in industry best practices, providing the semantic precision required for the FDT.\n"
  },
  {
    "id": "07",
    "category": "Architecture & Technology Stack",
    "prompt_title": "Prompt 7: Compare Data Orchestration Tools",
    "prompt_file": "fdt_prompt_library/architecture_and_technology_stack/07_compare_data_orchestration_tools.md",
    "prompt_content": "### **Prompt 7: Compare Data Orchestration Tools**\n\n\"Create a comparative analysis of three data workflow orchestration tools for the FDT's integration fabric: **Apache Airflow, Dagster, and Prefect.**\n1.  Structure the comparison as a markdown table with criteria such as: Core Paradigm (e.g., task-centric vs. asset-centric), Development Experience, Data Lineage support, and Local Testing.\n2.  Based on the analysis, provide a clear recommendation for the FDT project, justifying the choice. Emphasize alignment with the FDT's strategic goals (e.g., why Dagster's asset-centric model is a good fit).\"\n",
    "artifact_title": "Comparative Analysis of Data Workflow Orchestration Tools",
    "artifact_file": "fdt_artifacts/architecture_and_technology_stack/07_data_orchestration_tools_comparison.md",
    "artifact_content": "# Comparative Analysis of Data Workflow Orchestration Tools\n\nThe integration fabric of the Financial Digital Twin (FDT) is responsible for orchestrating the flow of data from source systems into the converged data platform. The choice of a workflow orchestration tool is critical for ensuring reliability, maintainability, and data quality. This analysis compares three leading tools: Apache Airflow, Prefect, and Dagster.\n\n## 1. Feature Comparison\n\n| Criterion               | Apache Airflow                                                              | Prefect                                                                          | Dagster                                                                                                  |\n| ----------------------- | --------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------- |\n| **Core Paradigm**       | **Task-centric.** Defines workflows as Directed Acyclic Graphs (DAGs) of tasks. The focus is on task execution and dependencies. | **Task- & Flow-centric.** Abstracts Python functions into tasks and groups them into flows. Focuses on robust workflow execution. | **Asset-centric.** Models workflows as graphs of software-defined assets (e.g., tables, files, models). The focus is on the data assets being produced. |\n| **Development Experience** | **Configuration-as-code (Python).** Can be verbose. Testing changes generally requires a working Airflow environment rather than a plain Python interpreter. The UI is primarily for monitoring, not development. | **Python-native.** Uses simple Python decorators. Feels more intuitive and less boilerplate-heavy. Hybrid execution model supports local and remote development. | **Python-native & Asset-focused.** Uses decorators to define assets and their upstream dependencies. Encourages a declarative style. 
The UI is a powerful development tool. |\n| **Data Lineage**        | **Limited native support.** Primarily tracks task status. Data lineage relies on integrations such as OpenLineage, typically paired with a backend like Marquez. | **Good support.** The UI provides visibility into task inputs and outputs, but lineage is organized around flow and task runs rather than around data assets. | **Excellent, built-in support.** Because it is asset-centric, data lineage is a first-class citizen. The UI provides a complete, end-to-end view of how assets are produced and consumed. |\n| **Local Testing**       | **Challenging.** Running and debugging DAGs locally can be complex and often requires running scheduler/worker components, sometimes via Docker. | **Excellent.** Designed for easy local execution and testing. Flows are ordinary Python functions, so they can be called and debugged directly from a script or a unit test. | **Excellent.** The asset model makes it easy to test and materialize individual assets or subgraphs locally. The UI can be run locally for a full-featured development environment. |\n\n## 2. Recommendation for the FDT Project\n\n**Recommendation: Dagster**\n\nWhile all three tools are powerful, **Dagster is the recommended data workflow orchestration tool for the Financial Digital Twin project.** This recommendation is based on its direct alignment with the core strategic goals of the FDT.\n\n### Justification:\n\nThe FDT is fundamentally an initiative focused on **data quality, transparency, and governance**. Its success depends on creating a reliable, auditable, and understandable data ecosystem. Dagster's **asset-centric paradigm** is uniquely suited to this mission.\n\n1.  **Alignment with Governance Goals:** By modeling the FDT's data pipelines as a graph of assets (e.g., `raw_loan_data`, `cleaned_customer_table`, `knowledge_graph_nodes`, `credit_risk_model`), Dagster provides an out-of-the-box, real-time data lineage graph. 
This is not just a technical feature; it is a critical business capability. It allows data stewards, regulators, and business users to see exactly where a piece of data came from, how it was transformed, and where it is used. This directly supports our goals for BCBS 239 compliance and building a transparent, auditable system.\n\n2.  **Improved Development and Maintenance:** The asset-based approach makes the entire system more maintainable. When a data quality issue arises, an analyst can immediately see the asset in the UI, view its history, understand its dependencies, and trace the problem to its source. This dramatically reduces the time to resolution for data issues compared to a task-centric world where one has to manually trace through execution logs.\n\n3.  **Future-Proofing the Architecture:** As the FDT grows in complexity, managing the \"spaghetti\" of task-based DAGs in Airflow becomes a significant challenge. Dagster's declarative nature, where you define assets and their sources, creates a self-documenting and more scalable architecture that is easier for new team members to understand and contribute to.\n\nWhile Airflow is the mature incumbent and Prefect offers an excellent development experience, Dagster's core philosophy of treating data assets as the primary focus of orchestration provides the strongest foundation for the long-term success, governance, and maintainability of the Financial Digital Twin.\n"
  },
  {
    "id": "08",
    "category": "Architecture & Technology Stack",
    "prompt_title": "Prompt 8: Design the Converged Data Platform",
    "prompt_file": "fdt_prompt_library/architecture_and_technology_stack/08_design_converged_data_platform.md",
    "prompt_content": "### **Prompt 8: Design the Converged Data Platform**\n\n\"Design the architecture for the FDT's **converged data platform**, which combines a data lakehouse with a specialized serving layer.\n1.  Describe the **Foundation Layer**: A data lakehouse (e.g., Databricks on S3/ADLS) and its role as the cost-effective, comprehensive system of record with ACID compliance.\n2.  Describe the **Serving Layer** and its 'polyglot persistence' approach. Detail the purpose of each specialized database:\n    * **Graph Database (e.g., Neo4j):** For the core FDT knowledge graph and relationship analysis.\n    * **Time-Series Database (e.g., TimescaleDB):** For high-frequency market data.\n    * **Search Index (e.g., OpenSearch):** For unstructured text data like documents and news.\"\n",
    "artifact_title": "Architecture Design: The FDT Converged Data Platform",
    "artifact_file": "fdt_artifacts/architecture_and_technology_stack/08_converged_data_platform_architecture.md",
    "artifact_content": "# Architecture Design: The FDT Converged Data Platform\n\nThe Financial Digital Twin (FDT) requires a data platform that is both comprehensive and highly performant. It must be able to cost-effectively store the entire history of the lending business while also serving real-time, complex queries to power simulations and AI agents. A single database technology cannot satisfy these competing requirements. Therefore, we propose a **converged data platform** architecture, which combines a scalable data lakehouse foundation with a specialized, high-performance serving layer.\n\nThis architecture is not about replacing one system with another, but about using the right tool for the right job.\n\n## 1. The Foundation Layer: The Lakehouse as the System of Record\n\nThe bedrock of the FDT is a **Data Lakehouse** built on a scalable cloud object store (e.g., Amazon S3 or Azure Data Lake Storage Gen2) with a transactional layer like Delta Lake or Apache Iceberg, managed via a platform like Databricks or Snowflake.\n\n### Role and Purpose:\n\n*   **Comprehensive System of Record:** The lakehouse will serve as the single source of truth for *all* data\u2014structured, semi-structured, and unstructured. It will ingest raw, untransformed data from every source system, providing a complete historical archive for compliance, audit, and future AI model training.\n*   **Cost-Effective, Scalable Storage:** Leveraging cloud object storage makes it economically feasible to store petabytes of historical data, which would be prohibitively expensive in a traditional data warehouse.\n*   **ACID Compliance and Reliability:** Transactional formats like Delta Lake bring ACID compliance, schema enforcement, and time-travel (data versioning) capabilities to the data lake. 
This prevents the \"data swamp\" problem and ensures that the data is reliable and auditable.\n*   **Batch Analytics and AI/ML Training:** The lakehouse is the ideal environment for large-scale batch processing, data engineering, and training machine learning models on the complete historical dataset.\n\nData flows from source systems into the lakehouse, where it is refined through a series of stages (e.g., bronze, silver, gold) into clean, well-structured tables. These tables then feed the specialized databases in the serving layer.\n\n## 2. The Serving Layer: Polyglot Persistence for Performance\n\nWhile the lakehouse is the comprehensive system of record, it is not optimized for the low-latency, specialized queries required by the FDT's real-time applications. For this, we employ a **\"polyglot persistence\"** strategy in the serving layer, using multiple, purpose-built databases to serve data to end-users and AI agents.\n\nThis layer is fed with curated, high-value data from the gold tables in the lakehouse.\n\n### Specialized Databases:\n\n*   **Graph Database (e.g., Neo4j)**\n    *   **Purpose:** To house the core **FDT Knowledge Graph**. This is the heart of the FDT, storing the highly interconnected network of customers, loans, collateral, guarantors, and their relationships.\n    *   **Justification:** A graph database is essential for performing the rapid, multi-hop relationship analysis required for counterparty risk detection, fraud ring identification, and contagion analysis. These are queries that are prohibitively slow or impossible in other database types.\n\n*   **Time-Series Database (e.g., TimescaleDB, InfluxDB)**\n    *   **Purpose:** To store and analyze high-frequency, time-stamped data.\n    *   **Justification:** The FDT needs to correlate portfolio performance with external market events. 
A time-series database is specifically designed to efficiently store and query massive volumes of data like stock prices, interest rate curves, and other market indices. It can answer questions like \"Show me the value of all collateral linked to this loan, bucketed by the minute\" with extreme speed.\n\n*   **Search Index (e.g., OpenSearch, Elasticsearch)**\n    *   **Purpose:** To provide powerful full-text search capabilities over unstructured and semi-structured text data.\n    *   **Justification:** A significant portion of our risk is buried in text: loan agreements, legal documents, news articles, customer emails, and regulatory filings. A search index allows analysts and AI agents to perform complex searches, aggregations, and semantic analysis on this corpus of text data, enabling capabilities like automated covenant monitoring from legal documents or flagging borrowers mentioned in negative news articles.\n\nBy combining the comprehensive storage of the lakehouse with the specialized performance of the polyglot serving layer, this converged architecture provides the scalability, flexibility, and real-time power required to bring the Financial Digital Twin to life.\n"
  },
  {
    "id": "09",
    "category": "Architecture & Technology Stack",
    "prompt_title": "Prompt 9: Compare Enterprise Graph Databases",
    "prompt_file": "fdt_prompt_library/architecture_and_technology_stack/09_compare_enterprise_graph_databases.md",
    "prompt_content": "### **Prompt 9: Compare Enterprise Graph Databases**\n\n\"Provide a detailed comparison of leading enterprise graph databases: **Neo4j, TigerGraph, and Amazon Neptune.**\n1.  Present the comparison in a markdown table, evaluating features like Data Model, Query Language, Scalability Model, Native Graph Data Science capabilities, and Security.\n2.  Conclude with a specific recommendation for the FDT, justifying the choice based on factors like ecosystem maturity, query language intuition, and the comprehensiveness of its data science library.\"\n",
    "artifact_title": "Comparison of Enterprise Graph Databases",
    "artifact_file": "fdt_artifacts/architecture_and_technology_stack/09_enterprise_graph_database_comparison.md",
    "artifact_content": "# Comparison of Enterprise Graph Databases\n\nThe graph database is the heart of the Financial Digital Twin's (FDT) serving layer, powering its real-time relationship analysis capabilities. The choice of this component will have long-term implications for development speed, performance, and the types of analytics we can perform. This document compares three leading enterprise graph databases: Neo4j, TigerGraph, and Amazon Neptune.\n\n## 1. Feature Comparison\n\n| Feature                         | Neo4j                                                                                                          | TigerGraph                                                                                                     | Amazon Neptune                                                                                                 |\n| ------------------------------- | -------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------- |\n| **Data Model**                  | **Labeled Property Graph (LPG).** The most widely used and intuitive model for property graphs.                   | **Labeled Property Graph (LPG).** Similar to Neo4j.                                                            | **Supports both Labeled Property Graph (LPG) and Resource Description Framework (RDF).** Offers more flexibility but can add complexity. |\n| **Query Language**              | **Cypher.** A declarative, SQL-like language. Widely considered the most intuitive and easy-to-learn graph query language. The language is published as the open-source openCypher specification and strongly influenced the ISO GQL standard. | **GSQL.** A powerful, Turing-complete language that is also SQL-like but is more verbose than Cypher. 
It combines declarative querying with imperative, procedural control flow. | **Gremlin, openCypher & SPARQL.** Supports Apache TinkerPop's Gremlin and openCypher for LPG queries and SPARQL for RDF. Gremlin is a procedural, graph-traversal language, which can be less intuitive for analysts than Cypher. Neptune's openCypher support does not include Neo4j-specific extensions such as APOC procedures. |\n| **Scalability Model**           | **Scale-up (Primary) & Scale-out.** Excellent single-server performance. Causal Clustering provides scale-out for reads. Fabric allows for sharding (federated queries) across multiple databases. | **Scale-out (Native MPP).** Built on a Massively Parallel Processing (MPP) architecture, designed for distributed, scale-out performance for both reads and writes across a cluster. | **Scale-out (Cloud-native).** A managed service that automatically scales read replicas (up to 15) and storage. Separates storage and compute for high availability and performance. |\n| **Native Graph Data Science**   | **Excellent.** The Graph Data Science (GDS) library is the most mature and comprehensive on the market, with over 60 tuned algorithms for tasks like community detection, centrality, and pathfinding. | **Very Good.** Offers a rich library of parallel graph algorithms designed to run in-database directly on the distributed graph. Strong for large-scale graph computation. | **Limited.** Does not have a built-in, comprehensive GDS library comparable to Neo4j or TigerGraph. Running complex graph algorithms often requires exporting data to an external analytics environment (e.g., using Neptune ML). |\n| **Security**                    | **Excellent.** Provides robust, fine-grained security controls, including Role-Based Access Control (RBAC) at the node label, property, and relationship type level. | **Good.** Offers user-defined roles and privileges. Security features are robust but historically have been considered less fine-grained than Neo4j's. 
| **Excellent (AWS IAM).** As an AWS service, it integrates deeply with AWS Identity and Access Management (IAM) for authentication and fine-grained access control policies. |\n\n## 2. Recommendation for the FDT Project\n\n**Recommendation: Neo4j**\n\nWhile TigerGraph offers superior scalability for massive graphs and Neptune provides seamless AWS integration, **Neo4j is the recommended graph database for the Financial Digital Twin project.**\n\n### Justification:\n\nThis recommendation prioritizes **developer productivity, ecosystem maturity, and the comprehensiveness of its analytical capabilities**, which are the most critical factors for the initial phases of the FDT's development and rollout.\n\n1.  **Intuitive Query Language (Cypher):** The success of the FDT depends on its adoption by a wide range of users, including risk analysts, data scientists, and business users. Cypher's declarative, visual, and easy-to-learn syntax significantly lowers the barrier to entry compared to the more programmatic Gremlin or the more complex GSQL. This will accelerate development and empower our analysts to write their own queries, fostering a data-driven culture.\n\n2.  **Mature Ecosystem and Community:** Neo4j has the largest and most mature ecosystem of any graph database. This includes extensive documentation, a vibrant community, a wide array of connectors (for Kafka, Spark, etc.), and visualization tools (Bloom). This maturity de-risks the project, ensuring we have access to a wealth of resources, talent, and third-party integrations.\n\n3.  **Comprehensive Graph Data Science (GDS) Library:** The FDT's vision extends beyond simple queries to advanced predictive analytics. Neo4j's GDS library is unparalleled in its breadth and maturity. The ability to run powerful algorithms for fraud detection (community detection), risk propagation (centrality), and 'what-if' analysis (pathfinding) directly within the database is a massive accelerator. 
This native capability will allow our data science team to rapidly develop and deploy the AI-driven features at the heart of the FDT, without the complexity of exporting data to a separate ML platform.\n\nWhile scalability is a consideration, Neo4j's clustering and Fabric architecture are more than sufficient to handle the scale of our lending portfolio for the foreseeable future. The overwhelming advantages in ease of use, ecosystem support, and native analytics make it the most pragmatic and powerful choice to ensure the success of the FDT.\n"
  },
  {
    "id": "10",
    "category": "Architecture & Technology Stack",
    "prompt_title": "Prompt 10: Generate a Risk Analysis Cypher Query",
    "prompt_file": "fdt_prompt_library/architecture_and_technology_stack/10_generate_risk_analysis_cypher_query.md",
    "prompt_content": "### **Prompt 10: Generate a Risk Analysis Cypher Query**\n\n\"You are a risk analyst using the Financial Digital Twin. Write a **Cypher query** for the Neo4j graph database to perform a multi-hop counterparty risk assessment.\n**Scenario:** Assess the total exposure to a borrower named 'Global Megacorp.'\nThe query must:\n1.  Find the borrower 'Global Megacorp.'\n2.  Identify all guarantors connected to this borrower.\n3.  Find any publicly traded stock owned by those guarantors.\n4.  Aggregate the borrower's direct exposure from its active loans.\n5.  Return a summary of the borrower's name, its total direct exposure, a list of its guarantors, and the stock symbols it's indirectly exposed to via those guarantors.\"\n",
    "artifact_title": "Cypher Query for Multi-Hop Counterparty Risk Assessment",
    "artifact_file": "fdt_artifacts/architecture_and_technology_stack/10_risk_analysis_cypher_query.md",
    "artifact_content": "# Cypher Query for Multi-Hop Counterparty Risk Assessment\n\nThis artifact provides a practical example of how the Financial Digital Twin's knowledge graph can be queried to perform a complex, multi-hop counterparty risk assessment. The query is written in Cypher for the Neo4j graph database.\n\n## Scenario\n\nAn analyst needs to assess the institution's total exposure to a key borrower, 'Global Megacorp.' This assessment must include not only the direct loans to the company but also the indirect risks posed by its network of guarantors. Specifically, the analyst wants to know what publicly traded stocks the institution is indirectly exposed to through these guarantor relationships.\n\n## Query Logic\n\nThe following Cypher query executes the analysis by traversing the graph in a series of steps:\n\n1.  **MATCH** the borrower, modeled as a `Company` node named 'Global Megacorp'.\n2.  **OPTIONAL MATCH** to find all active `Loan` nodes directly connected to the borrower and calculate the sum of their outstanding balances to get the `totalDirectExposure`. An `OPTIONAL MATCH` is used in case the borrower has no active loans, in which case the sum is 0.\n3.  **OPTIONAL MATCH** to traverse the graph from the borrower to the `Entity` nodes that guarantee it via the `GUARANTEES` relationship.\n4.  **OPTIONAL MATCH** from those guarantors to any `Stock` nodes they hold via the `OWNS` relationship.\n5.  **RETURN** a consolidated summary, using `collect(DISTINCT ...)` to aggregate the names of the guarantors and the stock symbols into lists.\n\n## Cypher Query\n\n```cypher\n// Scenario: Assess the total direct and indirect exposure of 'Global Megacorp'\n\n// 1. Find the borrower\nMATCH (borrower:Company {name: 'Global Megacorp'})\n\n// 4. Aggregate the borrower's direct exposure from its active loans\nOPTIONAL MATCH (borrower)-[:HAS_LOAN]->(loan:Loan {status: 'Active'})\nWITH borrower, sum(loan.outstandingAmount) AS totalDirectExposure\n\n// 2. 
Identify all guarantors connected to this borrower\nOPTIONAL MATCH (borrower)<-[:GUARANTEES]-(guarantor:Entity)\n\n// 3. Find any publicly traded stock owned by those guarantors\n// The guarantor node is carried forward directly, so no second lookup by name is needed\n// (names may not be unique). If guarantor is null, this match simply yields a null stock.\nOPTIONAL MATCH (guarantor)-[:OWNS]->(stock:Stock)\n\n// 5. Return a summary\n// borrower.name and totalDirectExposure act as grouping keys; collect() skips nulls,\n// so a borrower with no guarantors or stocks gets empty lists.\nRETURN\n  borrower.name AS borrowerName,\n  totalDirectExposure,\n  collect(DISTINCT guarantor.name) AS guarantorNames,\n  collect(DISTINCT stock.ticker) AS indirectStockExposure\n```\n\n### Example Result\n\nThis query would return a single row shaped like the following (values are illustrative), giving the analyst a consolidated view of direct and indirect counterparty exposure:\n\n```json\n{\n  \"borrowerName\": \"Global Megacorp\",\n  \"totalDirectExposure\": 75000000,\n  \"guarantorNames\": [\n    \"International Holdings Inc.\",\n    \"Peter Schmidt\"\n  ],\n  \"indirectStockExposure\": [\n    \"AAPL\",\n    \"GOOG\",\n    \"TSLA\"\n  ]\n}\n```\n"
  },
  {
    "id": "11",
    "category": "AI, Agents, & Analytics",
    "prompt_title": "Prompt 11: Design the Agentic Framework",
    "prompt_file": "fdt_prompt_library/ai_agents_and_analytics/11_design_agentic_framework.md",
    "prompt_content": "### **Prompt 11: Design the Agentic Framework**\n\n\"Design a **multi-agent system** to power the FDT's intelligent automation.\n1.  Explain the shift from monolithic applications to a collaborative system of autonomous AI agents.\n2.  Define the core **agent personas** and their specific responsibilities:\n    * **Credit Risk Agent:** Monitors portfolio credit quality.\n    * **Fraud Detection Agent:** Uses GNNs to find fraud rings.\n    * **Compliance Agent:** Screens against watchlists and monitors for SAR triggers.\n    * **Market Intelligence Agent:** Analyzes unstructured news and market data.\n    * **Query Agent:** Provides the natural language interface.\n3.  Describe how these agents would collaborate using a 'Supervisor' agent pattern to answer a complex user query, such as: 'Show me our highest-risk loans exposed to the recent downturn in commercial real estate.'\"\n",
    "artifact_title": "Design for the FDT Agentic Framework",
    "artifact_file": "fdt_artifacts/ai_agents_and_analytics/11_agentic_framework_design.md",
    "artifact_content": "# Design for the FDT Agentic Framework\n\nThe intelligence layer of the Financial Digital Twin (FDT) will be powered by a sophisticated **multi-agent system**. This represents a deliberate architectural shift away from rigid, monolithic applications toward a flexible and collaborative system of autonomous AI agents.\n\n## 1. The Shift to a Collaborative Agent System\n\nTraditional financial applications are monolithic. A single, large application handles all tasks, from data ingestion to analysis and user interface. This approach is brittle, difficult to update, and cannot adapt to the complexity and speed of modern data.\n\nThe FDT adopts an **agentic framework**, where the work is decomposed and assigned to a team of specialized, autonomous AI agents. Each agent is an expert in a specific domain. They are independent programs that can perceive their environment (the knowledge graph), reason about what to do, and take action to achieve their goals. They communicate and collaborate with each other to solve problems that are beyond the scope of any single agent. This approach is more resilient, scalable, and adaptable.\n\n## 2. Core Agent Personas\n\nThe FDT will launch with a core set of agent personas, each with a clearly defined role and responsibilities. These agents operate continuously on the FDT's knowledge graph.\n\n*   **Credit Risk Agent:**\n    *   **Responsibilities:** Continuously monitors the credit quality of the portfolio. It watches for breaches of `LoanCovenant` nodes, analyzes changes in `InternalRiskScore`, and alerts human analysts to deteriorating credit conditions. It can proactively run stress tests on sub-portfolios when it detects negative trends.\n\n*   **Fraud Detection Agent:**\n    *   **Responsibilities:** Uses Graph Neural Networks (GNNs) to identify patterns of coordinated fraudulent behavior. 
It looks for anomalies in the graph structure, such as the formation of dense clusters of new accounts linked by a single address or phone number (a potential synthetic identity ring), or unusual transaction patterns that deviate from historical norms.\n\n*   **Compliance Agent:**\n    *   **Responsibilities:** Ensures the institution adheres to regulatory requirements. It screens all new and existing entities against `Watchlist` nodes (e.g., OFAC sanctions lists) in real-time. It also monitors transaction patterns for behaviors that may require the filing of a Suspicious Activity Report (SAR), automatically flagging them for review.\n\n*   **Market Intelligence Agent:**\n    *   **Responsibilities:** Acts as the FDT's eyes and ears on the outside world. It ingests and analyzes unstructured data from news feeds, regulatory announcements, and market data providers. It uses Natural Language Processing (NLP) to identify events relevant to the portfolio (e.g., a negative news article about a borrower, a bankruptcy filing by a key supplier) and adds this context to the knowledge graph.\n\n*   **Query Agent:**\n    *   **Responsibilities:** Serves as the primary interface between human users and the FDT. It uses a Text-to-Cypher Large Language Model (LLM) to translate natural language questions from analysts into precise Cypher queries that can be executed against the knowledge graph.\n\n## 3. Collaboration via a Supervisor Agent Pattern\n\nThese agents achieve complex tasks by working together, coordinated by a **Supervisor Agent**. The Supervisor decomposes a complex user request into a series of sub-tasks and delegates them to the appropriate specialist agents.\n\n**Scenario:** A user asks the Query Agent: *\"Show me our highest-risk loans exposed to the recent downturn in commercial real estate.\"*\n\nThe collaboration would proceed as follows:\n\n1.  **User to Query Agent:** The user poses the question in natural language. 
The **Query Agent** parses the question, determines the user's intent, and recognizes that the request is too complex for a single Cypher query.\n\n2.  **Query Agent to Supervisor Agent:** The Query Agent passes the structured request to the **Supervisor Agent**.\n\n3.  **Supervisor Decomposes Task:** The Supervisor Agent breaks the request into a logical plan:\n    a.  *Step 1: Identify relevant market trend.* \"What constitutes the 'recent downturn in commercial real estate'?\"\n    b.  *Step 2: Find exposed borrowers.* \"Which borrowers in our portfolio are tied to the commercial real estate sector?\"\n    c.  *Step 3: Assess the risk of those borrowers.* \"What is the current risk rating of these exposed borrowers?\"\n    d.  *Step 4: Identify the specific loans.* \"Which active loans belong to the highest-risk borrowers identified in the previous step?\"\n    e.  *Step 5: Synthesize and report.* \"Combine the findings into a clear summary report for the user.\"\n\n4.  **Supervisor Delegates to Specialists:**\n    *   For Step 1, the Supervisor tasks the **Market Intelligence Agent**: \"Find recent negative sentiment and data related to the commercial real estate sector.\" The agent returns key data points, such as falling property values in specific regions.\n    *   For Steps 2 and 3, the Supervisor tasks the **Credit Risk Agent**: \"Using the context from the Market Intelligence Agent, find all borrowers in the commercial real estate sector and re-evaluate their `InternalRiskScore`. Return a list of borrowers now classified as 'High' or 'Very High' risk.\"\n    *   For Step 4, the Supervisor uses the list of high-risk borrowers from the Credit Risk Agent to perform a direct query on the graph to retrieve the associated loans.\n\n5.  **Supervisor to Query Agent:** For Step 5, the Supervisor consolidates the results from all agents into a structured answer. It passes this structured data back to the **Query Agent**.\n\n6.  **Query Agent to User:** The Query Agent uses its LLM to synthesize the structured data into a human-readable, narrative response, presenting the list of loans, the reasons they are considered high-risk, and the relevant market data.\n\nThis collaborative, supervised approach allows the FDT to answer complex, multi-faceted questions by combining the specialized expertise of its distributed AI agents.\n"
  },
  {
    "id": "12",
    "category": "AI, Agents, & Analytics",
    "prompt_title": "Prompt 12: Explain the Text-to-Cypher Engine",
    "prompt_file": "fdt_prompt_library/ai_agents_and_analytics/12_explain_text_to_cypher_engine.md",
    "prompt_content": "### **Prompt 12: Explain the Text-to-Cypher Engine**\n\n\"Explain the architecture and workflow of a **Text-to-Cypher** engine that serves as the FDT's natural language interface.\nThe process should include:\n1.  **Schema-Aware Prompting:** How the system provides the graph's schema to an LLM as context.\n2.  **Few-Shot Learning:** How example question/query pairs are used to improve accuracy.\n3.  **LLM-Powered Translation:** The role of the LLM (e.g., GPT-4o) in generating the Cypher query.\n4.  **Secure Execution:** The critical step of executing the generated query in the database (not the LLM) to enforce user permissions.\n5.  **Synthesized Response:** How the LLM synthesizes the structured data from the query result into a human-readable answer.\"\n",
    "artifact_title": "Architecture of the Text-to-Cypher Engine",
    "artifact_file": "fdt_artifacts/ai_agents_and_analytics/12_text_to_cypher_engine_architecture.md",
    "artifact_content": "# Architecture of the Text-to-Cypher Engine\n\nThe Text-to-Cypher engine is the core component of the FDT's **Query Agent**, providing the natural language interface that makes the power of the knowledge graph accessible to non-technical users like risk analysts and loan officers. This engine converts a user's plain-English question into a precise Cypher query, executes it securely, and synthesizes the results into a coherent answer.\n\nThe architecture is designed around a critical principle: **the Large Language Model (LLM) is a reasoning engine, not a data store.** The LLM never touches the sensitive data directly; it only ever sees the graph's schema and the data returned from a query it helped to create.\n\n## The Five-Step Workflow\n\nThe process of answering a user's question, such as \"Which of our borrowers in the manufacturing sector have loans that are past due?\", follows a secure, five-step workflow.\n\n### 1. Schema-Aware Prompting\n\nWhen the user submits their question, the Query Agent does not simply pass the raw question to the LLM. Instead, it constructs a detailed **meta-prompt**. This prompt provides the LLM with the necessary context to generate an accurate query. The prompt includes:\n\n*   **The User's Question:** \"Which of our borrowers in the manufacturing sector have loans that are past due?\"\n*   **The Graph Schema:** A compact, text-based representation of the relevant parts of the knowledge graph schema. This includes the node labels (`Company`, `Loan`), their properties (`industry`, `status`), and the relationships between them (`-[:HAS_LOAN]->`).\n*   **The Goal:** A clear instruction, e.g., \"You are a Cypher expert. Your task is to translate the user's question into a valid Cypher query based on the provided schema.\"\n\n### 2. Few-Shot Learning\n\nTo improve the accuracy and consistency of the generated Cypher, the meta-prompt also includes several examples of valid question-and-query pairs. This technique, known as **few-shot learning** (or few-shot prompting), primes the LLM with successful patterns at inference time, without any retraining.\n\n*   **Example 1:**\n    *   *Question:* \"How many loans does John Doe have?\"\n    *   *Query:* `MATCH (c:Customer {name: 'John Doe'})-[:HAS_LOAN]->(l:Loan) RETURN count(l)`\n*   **Example 2:**\n    *   *Question:* \"Find companies in the 'Technology' sector.\"\n    *   *Query:* `MATCH (c:Company {industry: 'Technology'}) RETURN c.name`\n\nThese examples guide the LLM to produce syntactically correct and semantically appropriate queries that conform to our specific graph schema.\n\n### 3. LLM-Powered Translation\n\nWith the full context of the schema and the few-shot examples, the Query Agent sends the meta-prompt to the LLM (e.g., GPT-4o, Llama 3). The LLM processes this information and returns a single output: a string containing the generated Cypher query.\n\n*   **LLM Output (a string):** `MATCH (c:Company {industry: 'Manufacturing'})-[:HAS_LOAN]->(l:Loan {status: 'Past Due'}) RETURN c.name AS borrowerName, l.id AS loanId`\n\nCrucially, this is the *only* output from the LLM at this stage.\n\n### 4. Secure Execution\n\nThis is the most critical step for security and data governance. **The LLM does not execute the query.** The Query Agent receives the Cypher string from the LLM and passes it to the Neo4j database for execution.\n\nThis enforces the FDT's security model:\n\n*   **Authentication & Authorization:** The query is executed against the database under the credentials of the logged-in user. The database's native Role-Based Access Control (RBAC) prevents the query from returning data the user is not permitted to view (e.g., a loan officer cannot see data from a different region).\n*   **Data Stays in the Database:** No instance data is sent to the third-party LLM service during query generation; at this stage the LLM sees only the schema. The only data it receives later is the query result in step 5, which has already been limited to what the user is authorized to see.\n\nThe database executes the query and returns a structured data result (e.g., a JSON object or a table) to the Query Agent.\n\n### 5. Synthesized Response\n\nThe Query Agent now has the final piece of the puzzle: the structured data returned from the database. It initiates a second, final call to the LLM. The prompt for this call includes:\n\n*   **The Original Question:** \"Which of our borrowers in the manufacturing sector have loans that are past due?\"\n*   **The Query Result:** The JSON data from the database.\n*   **The Goal:** \"You are a helpful analyst. Synthesize the following data into a clear, human-readable answer to the user's original question.\"\n\nThe LLM then generates a natural language summary for the user.\n\n*   **Final Output to User:** \"There are two borrowers in the manufacturing sector with past-due loans. The borrowers are 'Global Widgets Inc.' (Loan ID: L-789) and 'Advanced Materials LLC' (Loan ID: L-456).\"\n\nThis five-step process ensures that the Text-to-Cypher engine is not only powerful and user-friendly but also secure, auditable, and compliant with the institution's data governance policies.\n"
  },
  {
    "id": "13",
    "category": "AI, Agents, & Analytics",
    "prompt_title": "Prompt 13: Propose Advanced Analytics Capabilities",
    "prompt_file": "fdt_prompt_library/ai_agents_and_analytics/13_propose_advanced_analytics_capabilities.md",
    "prompt_content": "### **Prompt 13: Propose Advanced Analytics Capabilities**\n\n\"Outline a strategy for implementing advanced analytics and simulation in the FDT, moving from prediction to causation.\n1.  **Predictive Analytics:** Describe the use of **Graph Neural Networks (GNNs)** for sophisticated fraud detection, explaining how they learn from network topology to identify coordinated fraud rings.\n2.  **Causal Inference:** Explain how **Causal Inference** and Causal DAGs will be used for true 'what-if' counterfactual portfolio simulation, allowing analysts to understand *why* events happen, not just predict that they will.\n3.  **Explainable AI (XAI):** Detail the necessity of XAI techniques (like GNNExplainer) to ensure model transparency for business users, auditors, and regulators.\"\n",
    "artifact_title": "Strategy for Advanced Analytics and Simulation",
    "artifact_file": "fdt_artifacts/ai_agents_and_analytics/13_advanced_analytics_strategy.md",
    "artifact_content": "# Strategy for Advanced Analytics and Simulation\n\nTo fully realize the vision of shifting from \"Hindsight to Foresight,\" the Financial Digital Twin (FDT) must move beyond simple reporting and embrace a suite of advanced analytical capabilities. This strategy outlines the progression from powerful predictive models to true causal understanding, all while maintaining the transparency required in a regulated industry.\n\n## 1. Predictive Analytics: Graph Neural Networks for Fraud Detection\n\nStandard predictive models often fail to detect sophisticated fraud because they analyze entities in isolation. Coordinated fraud rings, however, are a network problem. **Graph Neural Networks (GNNs)** are a class of machine learning models designed to learn from network topology, which makes them well suited to this challenge.\n\n### The Strategy:\n\n*   **Learning from Connections:** The FDT's **Fraud Detection Agent** will employ a GNN model trained on the knowledge graph. The GNN learns not just from the attributes of an entity (e.g., a new customer's stated income) but from the structure of its connections. It learns to recognize the subtle graph patterns that indicate fraud, such as:\n    *   A high density of new accounts sharing a single device ID, address, or phone number.\n    *   Unusual \"synthetic\" identities that are connected to the rest of the graph in sparse or atypical ways.\n    *   Circular transaction patterns designed to artificially inflate creditworthiness.\n*   **Proactive Identification:** By learning these topological \"fingerprints\" of fraud, the GNN can score new applications or existing accounts based on their network structure, flagging potential fraud rings for investigation before a financial loss occurs.\n\n## 2. Causal Inference: True \"What-If\" Portfolio Simulation\n\nPredictive models are good at identifying correlations (e.g., customers with low credit scores are *more likely* to default), but they cannot explain causation. They cannot tell us *why* an event happened or what would happen if we intervened. To achieve true \"what-if\" simulation, the FDT will incorporate **Causal Inference**.\n\n### The Strategy:\n\n*   **Building a Causal DAG:** We will work with subject matter experts to model the causal relationships within the lending ecosystem as a Directed Acyclic Graph (DAG). This Causal DAG represents our hypotheses about what causes what (e.g., `Interest Rate Increase` -> `Reduced Borrower Cash Flow` -> `Increased Probability of Default`).\n*   **Counterfactual Simulation:** Using this causal model, analysts can move beyond simple correlation-based stress tests to ask **counterfactual questions**. Instead of just asking \"What happens if we assume a 10% default rate in the construction sector?\", an analyst can ask:\n    *   \"What would the impact on our portfolio be if the Federal Reserve *had not* raised interest rates last quarter?\"\n    *   \"What is the projected impact on our credit losses if we *intervene* by offering a three-month forbearance to all borrowers in a flood-affected region?\"\n*   **From Prediction to Control:** Causal inference allows the FDT to become a true decision-making tool. It provides a framework for understanding the *levers* that can be pulled to influence outcomes, moving the institution from a reactive to a proactive and strategic risk posture.\n\n## 3. Explainable AI (XAI): Ensuring Model Transparency\n\nIn a regulated industry like finance, a \"black box\" AI model is unacceptable. Every significant decision made or supported by the FDT must be explainable to business users, auditors, and regulators. **Explainable AI (XAI)** is therefore not an optional add-on but a core requirement.\n\n### The Strategy:\n\n*   **Implementing GNNExplainer:** For our GNN-based models, we will implement techniques such as **GNNExplainer**. When the GNN flags a customer or a transaction as high-risk, GNNExplainer can identify the most influential subgraph and node features that led to that prediction.\n*   **From \"What\" to \"Why\":** Instead of simply telling an analyst, \"This application is flagged for fraud (95% probability),\" the FDT will provide a human-readable explanation:\n    > \"This application is flagged for potential synthetic identity fraud because:\n    > 1.  The applicant shares a device ID with **7 other recent applications** (see subgraph).\n    > 2.  The provided address first appeared in our records **less than 30 days ago**.\n    > 3.  The applicant's social graph has **no connection** to any other established customers in our portfolio.\"\n*   **Building Trust and Accountability:** By integrating XAI into every AI-driven workflow, we ensure that our models are not only accurate but also transparent and trustworthy. This builds confidence among users, satisfies regulatory demands for model interpretability, and creates a system that is ultimately accountable.\n"
  },
  {
    "id": "14",
    "category": "Roadmap & Governance",
    "prompt_title": "Prompt 14: Create the Phased Implementation Roadmap",
    "prompt_file": "fdt_prompt_library/roadmap_and_governance/14_create_phased_implementation_roadmap.md",
    "prompt_content": "### **Prompt 14: Create the Phased Implementation Roadmap**\n\n\"Create a three-year, phased implementation roadmap for the Financial Digital Twin. Present this as a markdown table with the following columns: **Phase**, **Timeline**, **Key Objectives**, **Core Activities**, **Key Deliverables**, and **Success Metrics (KPIs)**.\n* **Phase 1 (Year 1): Foundational Layer & Core Use Case.** Focus on setting up the infrastructure, ontology, and delivering a consolidated counterparty exposure report.\n* **Phase 2 (Year 2): Advanced Analytics & Agentic Capabilities.** Focus on deploying GNN models, AI agents, and a natural language interface pilot.\n* **Phase 3 (Year 3): Enterprise Expansion & Causal Simulation.** Focus on full enterprise rollout and introducing causal inference for portfolio simulation.\"\n",
    "artifact_title": "Phased Implementation Roadmap for the Financial Digital Twin",
    "artifact_file": "fdt_artifacts/roadmap_and_governance/14_implementation_roadmap.md",
    "artifact_content": "# Phased Implementation Roadmap for the Financial Digital Twin\n\nThe development and rollout of the Financial Digital Twin (FDT) will be managed through a strategic, three-year phased roadmap. This approach allows us to deliver incremental value, manage complexity, and adapt to business needs over time. Each phase has a clear set of objectives, activities, and success metrics.\n\n| Phase   | Timeline | Key Objectives                                                                                                   | Core Activities                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  -                                                                                                             
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          -\n| Phase 1 | **Year 1** | **Foundational Layer & Core Use Case**                                                                           | \u2022 **Infrastructure Setup:** Provision cloud resources, data lakehouse, and graph database.<br>\u2022 **Data Ingestion:** Build pipelines to ingest data from critical source systems (LOS, Servicing) into the lakehouse.<br>\u2022 **Ontology V1:** Develop the initial proprietary lending ontology and map source data to it.<br>\u2022 **Graph Loading:** Populate the Neo4j graph with the core entities (customers, loans, collateral).<br>\u2022 **Core Application:** Develop and launch the first FDT application: a consolidated counterparty exposure report. | \u2022 **Deliverable:** A live, interactive report showing the total, 360-degree exposure for any given borrower.<br>\u2022 **KPI:** Reduce time to generate a complex counterparty exposure report from 3 days to under 1 minute. 
|\n| Phase 2 | **Year 2** | **Advanced Analytics & Agentic Capabilities**                                                                    | \u2022 **Deploy GNN Models:** Train and deploy the first graph neural network model for fraud detection.<br>\u2022 **Develop Agent Framework:** Build the core multi-agent system, including the Supervisor, Credit Risk, and Fraud Detection agents.<br>\u2022 **Natural Language Interface (Pilot):** Implement the Text-to-Cypher engine and launch a pilot with a select group of risk analysts.<br>\u2022 **Expand Data Sources:** Integrate external data sources, including market data and news feeds. | \u2022 **Deliverable:** A live fraud detection dashboard flagging suspicious applications in real-time.<br>\u2022 **Deliverable:** A functioning natural language chatbot for querying the FDT.<br>\u2022 **KPI:** Identify at least 2 previously undetected fraud rings within the first 6 months.<br>\u2022 **KPI:** Achieve an 80% success rate on user queries in the NLP pilot. |\n| Phase 3 | **Year 3** | **Enterprise Expansion & Causal Simulation**                                                                       | \u2022 **Full Enterprise Rollout:** Expand access to the FDT and its applications across the entire lending organization.<br>\u2022 **Causal Inference Model:** Develop and deploy the first Causal DAG for portfolio simulation.<br>\u2022 **Automated Compliance:** Fully automate SAR and BCBS 239 reporting via the Compliance Agent.<br>\u2022 **Bi-directional Integration:** Evolve pipelines to allow the FDT to write insights back to source systems (e.g., updating a customer's risk score in the servicing platform). 
| \u2022 **Deliverable:** A \"what-if\" simulation tool for portfolio stress testing based on causal inference.<br>\u2022 **Deliverable:** Fully automated generation of the quarterly BCBS 239 report.<br>\u2022 **KPI:** Achieve 95% user adoption within the lending organization.<br>\u2022 **KPI:** Reduce audit preparation time for key reports by 60%. |\n"
  },
  {
    "id": "15",
    "category": "Roadmap & Governance",
    "prompt_title": "Prompt 15: Design the Governance & Security Framework",
    "prompt_file": "fdt_prompt_library/roadmap_and_governance/15_design_governance_and_security_framework.md",
    "prompt_content": "### **Prompt 15: Design the Governance & Security Framework**\n\n\"Outline a comprehensive **governance, security, and compliance framework** for the FDT.\n1.  **Data Governance:** Define the key roles (Data Owners, Data Stewards) and the function of a Data Governance Council. Mention the importance of automated data lineage.\n2.  **BCBS 239 Compliance:** Explain how the FDT's architecture directly addresses key principles of BCBS 239 (e.g., Completeness, Timeliness, Adaptability).\n3.  **Graph-Native Security:** Describe a multi-layered security model using Role-Based Access Control (RBAC) with fine-grained permissions at the node and property level in Neo4j. Provide a sample rule (e.g., a Loan Officer can see a loan but not the sensitive details of a sanctions screening result).\n4.  **LLMOps:** Detail the strategies for managing the security of the Text-to-Cypher LLM, including prompt injection prevention and ensuring the LLM never directly accesses sensitive data.\"\n",
    "artifact_title": "FDT Governance, Security, and Compliance Framework",
    "artifact_file": "fdt_artifacts/roadmap_and_governance/15_governance_and_security_framework.md",
    "artifact_content": "# FDT Governance, Security, and Compliance Framework\n\nThe Financial Digital Twin (FDT) will manage the institution's most sensitive and critical data. Therefore, it must be governed by a comprehensive framework that ensures data quality, enforces security, and guarantees regulatory compliance. This framework is not an afterthought but a foundational pillar of the FDT program.\n\n## 1. Data Governance\n\nA formal Data Governance program will be established to manage the FDT's data as a strategic enterprise asset.\n\n### Key Roles and Responsibilities:\n\n*   **Data Owners:** Senior business leaders (e.g., Chief Credit Officer) who are ultimately accountable for the quality and ethical use of a specific data domain (e.g., credit risk data). They set policies for their domain.\n*   **Data Stewards:** Subject matter experts, assigned by Data Owners, who are responsible for the day-to-day management of data. They define data quality rules, manage metadata, and are the go-to experts for their specific data assets (e.g., the `Loan` entity in the knowledge graph).\n\n### Data Governance Council:\n\nA cross-functional **Data Governance Council**, chaired by the Chief Data Officer, will be formed. It will be composed of Data Owners and key stakeholders from IT, security, compliance, and legal. The council's mandate is to:\n*   Ratify data policies and standards.\n*   Resolve data-related issues and conflicts.\n*   Prioritize data quality and governance initiatives.\n*   Oversee the FDT's adherence to the governance framework.\n\n**Automated Data Lineage:** The FDT's choice of Dagster as its orchestration tool is key to governance. The asset-based lineage it generates provides an automated, always-up-to-date map of how data flows and is transformed, which is critical for auditability and trust.\n\n## 2. 
BCBS 239 Compliance\n\nThe FDT's architecture is designed from the ground up to address the core principles of the Basel Committee on Banking Supervision's *Principles for effective risk data aggregation and risk reporting* (BCBS 239).\n\n*   **Completeness:** By ingesting data from all relevant source systems into the lakehouse and knowledge graph, the FDT ensures that risk calculations are based on a complete and holistic view of the business.\n*   **Timeliness:** The converged data platform, with its real-time serving layer, enables the FDT to generate risk reports on demand, moving from slow, batch-based reporting to real-time situational awareness.\n*   **Adaptability:** The flexible nature of the knowledge graph and the modularity of the agentic framework allow the FDT to be quickly adapted to new regulations, business requirements, or market conditions without requiring a complete architectural overhaul.\n*   **Accuracy & Integrity:** The Data Governance framework, coupled with automated data quality checks within the data pipelines, ensures the accuracy and integrity of the data used for risk reporting. Data lineage provides a clear audit trail.\n\n## 3. Graph-Native Security\n\nSecurity will be enforced at multiple layers, with the graph database at the core providing fine-grained access control. We will implement a **Role-Based Access Control (RBAC)** model directly within Neo4j.\n\n### Multi-Layered Model:\n\n1.  **Authentication:** All users and applications must authenticate with the database.\n2.  **Authorization (RBAC):** Once authenticated, users are assigned roles that grant specific privileges. These privileges are not just at the database level but can be defined at the graph level.\n3.  
**Fine-Grained Permissions:** Neo4j's RBAC allows us to control access to specific nodes, relationships, and even properties within the graph.\n\n### Sample RBAC Rule:\n\n*   **Role:** `Loan_Officer_US_West`\n*   **Permissions:**\n    *   `GRANT ACCESS ON DATABASE lending_db TO Loan_Officer_US_West` - Allows members of the role to connect to the database.\n    *   `GRANT MATCH {*} ON GRAPH lending_db FOR (n:Customer) WHERE n.region = 'US-West' TO Loan_Officer_US_West` (repeated for `Loan` and `Collateral`) - Allows the role to `MATCH` and view Customer, Loan, and Collateral nodes, but *only* where the `region` property is 'US-West'. This relies on property-based access control, which is available only in recent Neo4j 5 Enterprise releases; the exact syntax should be confirmed against the deployed version.\n    *   `DENY READ {sensitiveScreeningDetails} ON GRAPH lending_db NODES SanctionsScreeningResult TO Loan_Officer_US_West` - Explicitly denies the role the ability to read the sensitive details of a sanctions screening result, even if its members can see that a screening took place.\n\nThis granular, graph-native approach ensures that users can only access the specific data \"slice\" they are authorized to see, enforcing the principle of least privilege.\n\n## 4. LLMOps: Managing LLM Security\n\nThe Text-to-Cypher engine, while powerful, introduces a new potential attack surface. Our LLMOps strategy is designed to mitigate these risks.\n\n*   **Prompt Injection Prevention:** The Query Agent will implement strict input validation and sanitization on all user-provided text. It will also use delimiters and explicit instruction formatting in the meta-prompt to reduce the risk of a malicious user crafting a prompt that could cause the LLM to ignore its original instructions.\n*   **Never Trust, Always Verify:** The Query Agent will include a validation step to inspect the Cypher query returned by the LLM before execution. It will check for destructive or unauthorized clauses (e.g., `DELETE`, `DETACH DELETE`, `DROP`) and for procedure calls outside an approved list. Any query that fails validation will be rejected.\n*   **No Direct Data Access:** As detailed in the Text-to-Cypher architecture, the LLM will **never** be granted direct access to the database. 
Its role is strictly limited to translating text to Cypher and synthesizing results, based only on the schema and the data provided to it by the secure Query Agent.\n*   **Regular Auditing:** All user questions and LLM-generated queries will be logged for regular auditing. This allows us to monitor for misuse, identify patterns of failed queries, and continuously improve the safety and accuracy of the system.\n"
  }
]
