KnowFlow

KnowFlow is a hybrid Retrieval-Augmented Generation (RAG) system that combines semantic vector search with a knowledge graph to process documents and answer queries over them.

🌟 Features

  • Advanced Document Processing

    • Multi-format support (PDF, DOCX, CSV, TXT)
    • Intelligent chunking with configurable size and overlap
    • Parallel batch processing with S3 storage
    • Document status tracking (PENDING, PROCESSING, INDEXED, FAILED)
    • Secure per-user document isolation
  • Hybrid RAG + Knowledge Graph Architecture

    • Dense semantic embeddings via Google Gemini + pgvector
    • Structured knowledge extraction to Neo4j
    • Multi-hop reasoning through graph relationships
    • Automatic entity and relationship mapping
    • Query decomposition for complex questions
  • Smart Query Processing

    • Automatic query decomposition for complex questions
    • Hybrid vector + graph-based retrieval
    • Retrieval quality evaluation and improvement
    • Context-aware response synthesis
    • Conversation memory with graph context
  • Chat & Session Management

    • Persistent chat sessions with history
    • Context-aware follow-up questions
    • Session renaming and management
    • Message tracking with context preservation
    • Multi-user support with isolation
  • Security & Authentication

    • JWT-based authentication
    • Secure password hashing with bcrypt
    • Role-based access control
    • Per-user data isolation
    • Document access verification
  • Storage & Infrastructure

    • S3-compatible object storage
    • PostgreSQL for structured data
    • Neo4j for graph relationships
    • Concurrent file operations
    • Efficient batch processing
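
As an illustration of "chunking with configurable size and overlap", here is a minimal sketch that splits text into fixed-size character windows sharing a configurable overlap. The function name `chunk_text`, its defaults, and the character-based splitting are assumptions made for this example; KnowFlow's actual chunker may split on tokens or sentences instead.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters.

    Hypothetical helper for illustration; not KnowFlow's actual API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each iteration
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Each chunk repeats the last character of the previous one (overlap=1).
print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

A larger overlap keeps more shared context between neighbouring chunks at the cost of more embeddings to store, so the two parameters are usually tuned together.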

🏗️ Architecture

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • PostgreSQL 14+ with pgvector extension
  • Neo4j 5.0+
  • S3-compatible storage
  • Google Cloud API key for Gemini

Environment Variables

  1. Start the development server:

🔒 Security Features

  • JWT-based authentication with expiration
  • Bcrypt password hashing
  • Per-user document isolation
  • Access control verification
  • Secure file storage paths
  • Input validation and sanitization
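
To make "JWT-based authentication with expiration" concrete, the sketch below signs and verifies an HS256 token using only the Python standard library. It is an assumption-laden illustration: the function names `create_token` and `verify_token`, the claim set, and the one-hour default lifetime are hypothetical, and a real deployment would more likely rely on a maintained JWT library than on hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_token(user_id: str, secret: str, ttl_seconds: int = 3600,
                 now: Optional[float] = None) -> str:
    """Issue an HS256 token whose `exp` claim is ttl_seconds after issuance."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "iat": issued_at, "exp": issued_at + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def verify_token(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Return the claims if the signature is valid and the token has not expired."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = _sign(header_b64 + "." + payload_b64, secret)
    # compare_digest avoids leaking timing information about the signature.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("invalid signature")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if current >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```

The optional `now` parameter exists only so that expiry can be exercised deterministically in tests; production callers would omit it and let the functions read the system clock.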

📊 Monitoring & Logging

  • Structured logging with levels
  • Request/response tracking
  • Error handling and reporting
  • Performance metrics
  • Document processing status
  • Chat session analytics
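
One way to get "structured logging with levels" is to emit each record as a single JSON object using Python's standard `logging` module, as sketched below. The `JsonFormatter` class, the `build_logger` helper, the logger name `knowflow.documents`, and the context fields (`user_id`, `document_id`, `status`) are all hypothetical names chosen for this example, not KnowFlow's actual configuration.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    # Hypothetical context fields that callers may pass through `extra=`.
    CONTEXT_FIELDS = ("user_id", "document_id", "status")

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for key in self.CONTEXT_FIELDS:
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(name: str, stream, level: int = logging.INFO) -> logging.Logger:
    """Create a logger that writes JSON lines to `stream` at the given level."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep records out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


# Usage: log a document status change; the DEBUG record is filtered out at INFO level.
buffer = io.StringIO()
log = build_logger("knowflow.documents", buffer)
log.info("document indexed", extra={"document_id": "doc-1", "status": "INDEXED"})
log.debug("chunk details omitted at INFO level")
print(buffer.getvalue(), end="")
```

Because every line is valid JSON, fields such as `status` or `document_id` can be filtered and aggregated by a log collector without parsing free-form messages.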

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a Pull Request

📄 License

This project is licensed under the terms of the LICENSE file included in the repository.