CocoIndex makes it easy to build and maintain knowledge graphs with continuous source updates. In this post, we will process a list of documents (using the CocoIndex documentation as an example) and use an LLM to extract relationships between the concepts in each document. We will generate two kinds of relationships:

1. Relationships between subjects and objects. E.g., "CocoIndex supports Incremental Processing".
2. Mentions of entities in a document. E.g., "core/basics.mdx" mentions CocoIndex and Incremental Processing.

The source code is available at CocoIndex Examples - docs_to_knowledge_graph.

We are constantly improving, and more features and examples are coming soon. Stay tuned and follow our progress by starring our GitHub repo.

Prerequisites

Documentation

You can read the official CocoIndex Documentation for Property Graph Targets here.

Data flow to build knowledge graph

Add documents as source

We will process CocoIndex documentation markdown files (.md, .mdx) from the docs/core directory (markdown files, deployed docs).

    import cocoindex

    @cocoindex.flow_def(name="DocsToKG")
    def docs_to_kg_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope):
        # Read every .md and .mdx file under docs/core as the flow's source.
        data_scope["documents"] = flow_builder.add_source(
            cocoindex.sources.LocalFile(path="../../docs/docs/core",
                                        included_patterns=["*.md", "*.mdx"]))

Here flow_builder.add_source creates a KTable. filename is the key of the KTable.

Add data collectors

Add collectors at the root scope:

    document_node = data_scope.add_collector()
    entity_relationship = data_scope.add_collector()
    entity_mention = data_scope.add_collector()

- document_node collects documents. E.g., core/basics.mdx is a document.
- entity_relationship collects relationships. E.g., "CocoIndex supports Incremental Processing" indicates a relationship between CocoIndex and Incremental Processing.
- entity_mention collects mentions of entities in a document. E.g., core/basics.mdx mentions CocoIndex and Incremental Processing.

Process each document and extract summary

Define a...
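To make the two kinds of relationships from the introduction concrete, here is a minimal plain-Python sketch of the rows that the entity_relationship and entity_mention collectors might end up holding. It does not use the CocoIndex API. The class names, the field names, and the helper mentions_from are hypothetical, and it assumes that a document "mentions" every entity appearing as the subject or object of a relationship extracted from it, which the text above does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """One subject-predicate-object triple (hypothetical field names)."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class Mention:
    """Records that a document mentions an entity (hypothetical field names)."""
    filename: str
    entity: str


def mentions_from(filename: str, relationships: list[Relationship]) -> list[Mention]:
    """Assumption: a document mentions each distinct subject and object
    of the relationships extracted from it, in first-seen order."""
    seen: list[str] = []
    for rel in relationships:
        for entity in (rel.subject, rel.obj):
            if entity not in seen:
                seen.append(entity)
    return [Mention(filename, entity) for entity in seen]


# The example sentence from the post, written as a triple.
relationships = [Relationship("CocoIndex", "supports", "Incremental Processing")]
mentions = mentions_from("core/basics.mdx", relationships)

for m in mentions:
    print(f"{m.filename} mentions {m.entity}")
```

In a graph, each Relationship would become an edge between two entity nodes, and each Mention an edge from a document node to an entity node.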