Build Real-Time Knowledge Graph for Documents with LLM

Summary

CocoIndex makes it easy to build and maintain knowledge graphs with continuous source updates. In this blog post, we will process a list of documents (using the CocoIndex documentation as an example). We will use an LLM to extract relationships between the concepts in each document. We will generate two kinds of relationships:

1. Relationships between subjects and objects. E.g., "CocoIndex supports Incremental Processing".
2. Mentions of entities in a document. E.g., "core/basics.mdx" mentions CocoIndex and Incremental Processing.

The source code is available at CocoIndex Examples - docs_to_knowledge_graph.

We are constantly improving, and more features and examples are coming soon. Stay tuned and follow our progress by starring our GitHub repo.

Prerequisites

Documentation

You can read the official CocoIndex Documentation for Property Graph Targets here.

Data flow to build knowledge graph

Add documents as source

We will process CocoIndex documentation markdown files (.md, .mdx) from the docs/core directory (markdown files, deployed docs).

    import cocoindex

    @cocoindex.flow_def(name="DocsToKG")
    def docs_to_kg_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope):
        # Read every .md and .mdx file under docs/core as the flow's source.
        data_scope["documents"] = flow_builder.add_source(
            cocoindex.sources.LocalFile(path="../../docs/docs/core",
                                        included_patterns=["*.md", "*.mdx"]))

Here, flow_builder.add_source creates a KTable whose key is filename.

Add data collectors

Add collectors at the root scope:

    # Inside docs_to_kg_flow, where data_scope is available.
    document_node = data_scope.add_collector()
    entity_relationship = data_scope.add_collector()
    entity_mention = data_scope.add_collector()

document_node collects documents. E.g., core/basics.mdx is a document.

entity_relationship collects relationships. E.g., "CocoIndex supports Incremental Processing" indicates a relationship between CocoIndex and Incremental Processing.

entity_mention collects mentions of entities in a document. E.g., core/basics.mdx mentions CocoIndex and Incremental Processing.
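To make the roles of the three collectors concrete, here is a minimal plain-Python sketch of the rows each one would end up holding for the example above. It does not use the CocoIndex API: the Relationship class, the dictionaries, the sample document text, and the hard-coded extraction result are all hypothetical stand-ins (in the real flow an LLM produces the relationships). The sketch also assumes that every subject and object of a relationship counts as a mention in the document it came from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Hypothetical subject-predicate-object triple extracted from one document."""
    subject: str
    predicate: str
    object: str


# Stand-in for the source table: one row per file, keyed by filename.
# The content string is made up for illustration.
documents = {
    "core/basics.mdx": "CocoIndex supports Incremental Processing.",
}

# Plain lists standing in for the three collectors.
document_node = []
entity_relationship = []
entity_mention = []

# Hard-coded extraction result; an LLM would produce this in the real flow.
extracted = {
    "core/basics.mdx": [
        Relationship("CocoIndex", "supports", "Incremental Processing"),
    ],
}

for filename in documents:
    # One document node per source file.
    document_node.append({"filename": filename})
    for rel in extracted[filename]:
        # One relationship row per extracted triple.
        entity_relationship.append(rel)
        # Assumption of this sketch: both ends of a triple are mentions.
        for entity in (rel.subject, rel.object):
            entity_mention.append({"filename": filename, "entity": entity})

print(document_node)
print(entity_relationship)
print(entity_mention)
```

Keeping documents, relationships, and mentions in separate collections mirrors the two kinds of graph edges described earlier: entity-to-entity relationships and document-to-entity mentions.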
Process each document and extract summary

Define a...

First seen: 2025-05-13 20:32

Last seen: 2025-05-14 03:33