diff --git a/site/content/3.13/data-science/_index.md b/site/content/3.13/data-science/_index.md index c543cebc6e..b3158d8fc6 100644 --- a/site/content/3.13/data-science/_index.md +++ b/site/content/3.13/data-science/_index.md @@ -3,32 +3,81 @@ title: Data Science menuTitle: Data Science weight: 115 description: >- - ArangoDB lets you apply analytics and machine learning to graph data at scale + ArangoDB's set of tools and technologies enables analytics, machine learning, + and GenAI applications powered by graph data aliases: - data-science/overview --- +ArangoDB provides a wide range of functionality that can be utilized for +data science applications. The core database system includes multi-model storage +of information with scalable graph and information retrieval capabilities that +you can directly use for your research and product development. + +ArangoDB also offers a dedicated Data Science Suite, using the database core +as the foundation for higher-level features. Whether you want to turbocharge +generative AI applications with a GraphRAG solution or apply analytics and +machine learning to graph data at scale, ArangoDB covers these needs. + + + +## Data Science Suite + +The Data Science Suite (DSS) consists of three major components: + +- [**HybridRAG**](#hybridrag): A complete solution for extracting entities + from text files to create a knowledge graph that you can then query with a + natural language interface. +- [**GraphML**](#graphml): Apply machine learning to graphs for link prediction, + classification, and similar tasks. +- [**Graph Analytics**](#graph-analytics): Run graph algorithms such as PageRank + on dedicated compute resources. + +Each component has an intuitive graphical user interface integrated into the +ArangoDB Platform web interface, guiding you through the process. 
+ + +Alongside these components, you also get the following additional features: -ArangoDB, as the foundation for GraphML, comes with the following key features: +- **Graph visualizer**: A web-based tool for exploring your graph data with an + intuitive interface and sophisticated querying capabilities. +- **Jupyter notebooks**: Run a Jupyter kernel in the platform for hosting + interactive notebooks for experimentation and development of applications + that use ArangoDB as their backend. +- **MLflow integration**: Built-in support for the popular management tool for + the machine learning lifecycle. +- **Adapters**: Use ArangoDB together with cuGraph, NetworkX, and other tools. +- **Application Programming Interfaces**: Use the underlying APIs of the + Data Science Suite services and build your own integrations. -- **Scalable**: designed to support true scalability with high performance for +## From graph to AI + +This section classifies the complexity of the queries you can answer with +ArangoDB and gives you an overview of the respective features. + +It starts with running a simple query that shows the path from +one node to another, continues with more complex tasks like graph classification, +link prediction, and node classification, and ends with generative AI solutions +powered by graph relationships and vector embeddings. + +### Foundational features + +ArangoDB comes with the following key features: + +- **Scalable**: Designed to support true scalability with high performance for enterprise use cases. -- **Simple Ingestion**: easy integration in existing data infrastructure with +- **Simple Ingestion**: Easy integration in existing data infrastructure with connectors to all leading data processing and data ecosystems. -- **Source-Available**: extensibility and community. -- **NLP Support**: built-in text processing, search, and similarity ranking. 
- -![ArangoDB Machine Learning Architecture](../../images/machine-learning-architecture.png) +- **Source-Available**: Extensibility and community. +- **NLP Support**: Built-in text processing, search, and similarity ranking. -## Graph Analytics vs. GraphML + -This section classifies the complexity of the queries we can answer - -like running a simple query that shows what is the path that goes from one node -to another, or more complex tasks like node classification, -link prediction, and graph classification. +![ArangoDB Machine Learning Architecture](../../images/machine-learning-architecture.png) ### Graph Queries @@ -69,65 +118,24 @@ GraphML can answer questions like: ![Graph ML](../../images/graph-ml.png) For ArangoDB's enterprise-ready, graph-powered machine learning offering, -see [ArangoGraphML](arangographml/_index.md). - -## Use Cases - -This section contains an overview of different use cases where Graph Analytics -and GraphML can be applied. - -### GraphML - -GraphML capabilities of using more data outperform conventional deep learning -methods and **solve high-computational complexity graph problems**, such as: -- Drug discovery, repurposing, and predicting adverse effects. -- Personalized product/service recommendation. -- Supply chain and logistics. - -With GraphML, you can also **predict relationships and structures**, such as: -- Predict molecules for treating diseases (precision medicine). -- Predict fraudulent behavior, credit risk, purchase of product or services. -- Predict relationships among customers, accounts. - -ArangoDB uses well-known GraphML frameworks like -[Deep Graph Library](https://www.dgl.ai) -and [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/) -and connects to these external machine learning libraries. When coupled to -ArangoDB, you are essentially integrating them with your graph dataset. 
- -## Example: ArangoFlix - -ArangoFlix is a complete movie recommendation application that predicts missing -links between a user and the movies they have not watched yet. - -This [interactive tutorial](https://colab.research.google.com/github/arangodb/interactive_tutorials/blob/master/notebooks/Integrate_ArangoDB_with_PyG.ipynb) -demonstrates how to integrate ArangoDB with PyTorch Geometric to -build recommendation systems using Graph Neural Networks (GNNs). - -The full ArangoFlix demo website is accessible from the ArangoGraph Insights Platform, -the managed cloud for ArangoDB. You can open the demo website that connects to -your running database from the **Examples** tab of your deployment. +see [ArangoGraphML](graphml/_index.md). -{{< tip >}} -You can try out the ArangoGraph Insights Platform free of charge for 14 days. -Sign up at [dashboard.arangodb.cloud](https://dashboard.arangodb.cloud/home?utm_source=docs&utm_medium=cluster_pages&utm_campaign=docs_traffic). -{{< /tip >}} +### HybridRAG -The ArangoFlix demo uses five different recommendation methods: -- Content-Based using AQL -- Collaborative Filtering using AQL -- Content-Based using ML -- Matrix Factorization -- Graph Neural Networks +HybridRAG is ArangoDB's turn-key solution to turn your organization's data into +a knowledge graph and let everyone utilize the knowledge by asking questions in +natural language. -![ArangoFlix demo](../../images/data-science-arangoflix.png) +HybridRAG combines vector search for retrieving related text snippets +with graph-based retrieval augmented generation (GraphRAG) for context expansion +and relationship discovery. This lets a large language model (LLM) generate +answers that are accurate, context-aware, and chronologically structured. +This approach combats the common problem of hallucination. 
-The ArangoFlix website not only offers an example of how the user recommendations might -look like in real life, but it also provides information on a recommendation method, -an AQL query, a custom graph visualization for each movie, and more. +To learn more, see the [HybridRAG](hybrid-rag.md) documentation. ## Sample datasets If you want to try out ArangoDB's data science features, you may use the -[`arango_datasets` Python package](../components/tools/arango-datasets.md) +[`arango-datasets` Python package](../components/tools/arango-datasets.md) to load sample datasets into a deployment. diff --git a/site/content/3.13/data-science/adapters/_index.md b/site/content/3.13/data-science/adapters/_index.md index 0aa3efea24..75d8c4558b 100644 --- a/site/content/3.13/data-science/adapters/_index.md +++ b/site/content/3.13/data-science/adapters/_index.md @@ -1,7 +1,7 @@ --- title: Adapters menuTitle: Adapters -weight: 140 +weight: 50 description: >- ArangoDB offers multiple adapters that enable seamless integration with data science tools diff --git a/site/content/3.13/data-science/arangograph-notebooks.md b/site/content/3.13/data-science/arangograph-notebooks.md index 34ca9529be..d6b2a94930 100644 --- a/site/content/3.13/data-science/arangograph-notebooks.md +++ b/site/content/3.13/data-science/arangograph-notebooks.md @@ -1,7 +1,7 @@ --- title: ArangoGraph Notebooks menuTitle: ArangoGraph Notebooks -weight: 130 +weight: 40 description: >- Colocated Jupyter Notebooks within the ArangoGraph Insights Platform --- diff --git a/site/content/3.13/data-science/graph-analytics.md b/site/content/3.13/data-science/graph-analytics.md index 18df401e84..8983a2124e 100644 --- a/site/content/3.13/data-science/graph-analytics.md +++ b/site/content/3.13/data-science/graph-analytics.md @@ -1,7 +1,7 @@ --- title: Graph Analytics menuTitle: Graph Analytics -weight: 123 +weight: 30 description: | ArangoGraph offers Graph Analytics Engines to run graph algorithms on your data separately 
from your ArangoDB deployments diff --git a/site/content/3.13/data-science/arangographml/_index.md b/site/content/3.13/data-science/graphml/_index.md similarity index 79% rename from site/content/3.13/data-science/arangographml/_index.md rename to site/content/3.13/data-science/graphml/_index.md index 2d5d3324de..54d0bed905 100644 --- a/site/content/3.13/data-science/arangographml/_index.md +++ b/site/content/3.13/data-science/graphml/_index.md @@ -1,11 +1,11 @@ --- -title: ArangoGraphML -menuTitle: ArangoGraphML -weight: 125 +title: ArangoGraphML # Rename as well? +menuTitle: GraphML +weight: 20 description: >- Enterprise-ready, graph-powered machine learning as a cloud service or self-managed aliases: - - graphml + - arangographml --- Traditional Machine Learning (ML) overlooks the connections and relationships between data points, which is where graph machine learning excels. However, @@ -13,6 +13,56 @@ accessibility to GraphML has been limited to sizable enterprises equipped with specialized teams of data scientists. ArangoGraphML simplifies the utilization of GraphML, enabling a broader range of personas to extract profound insights from their data. +## Use cases + +GraphML capabilities of using more data outperform conventional deep learning +methods and **solve high-computational complexity graph problems**, such as: +- Drug discovery, repurposing, and predicting adverse effects. +- Personalized product/service recommendation. +- Supply chain and logistics. + +With GraphML, you can also **predict relationships and structures**, such as: +- Predict molecules for treating diseases (precision medicine). +- Predict fraudulent behavior, credit risk, purchase of product or services. +- Predict relationships among customers, accounts. + +ArangoDB uses well-known GraphML frameworks like +[Deep Graph Library](https://www.dgl.ai) +and [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/) +and connects to these external machine learning libraries. 
When coupled to +ArangoDB, you are essentially integrating them with your graph dataset. + +### Example: ArangoFlix + +ArangoFlix is a complete movie recommendation application that predicts missing +links between a user and the movies they have not watched yet. + +This [interactive tutorial](https://colab.research.google.com/github/arangodb/interactive_tutorials/blob/master/notebooks/Integrate_ArangoDB_with_PyG.ipynb) +demonstrates how to integrate ArangoDB with PyTorch Geometric to +build recommendation systems using Graph Neural Networks (GNNs). + +The full ArangoFlix demo website is accessible from the ArangoGraph Insights Platform, +the managed cloud for ArangoDB. You can open the demo website that connects to +your running database from the **Examples** tab of your deployment. + +{{< tip >}} +You can try out the ArangoGraph Insights Platform free of charge for 14 days. +Sign up at [dashboard.arangodb.cloud](https://dashboard.arangodb.cloud/home?utm_source=docs&utm_medium=cluster_pages&utm_campaign=docs_traffic). +{{< /tip >}} + +The ArangoFlix demo uses five different recommendation methods: +- Content-Based using AQL +- Collaborative Filtering using AQL +- Content-Based using ML +- Matrix Factorization +- Graph Neural Networks + +![ArangoFlix demo](../../../images/data-science-arangoflix.png) + +The ArangoFlix website not only offers an example of what the user recommendations might
+ ## How GraphML works Graph machine learning leverages the inherent structure of graph data, where diff --git a/site/content/3.13/data-science/arangographml/deploy.md b/site/content/3.13/data-science/graphml/deploy.md similarity index 98% rename from site/content/3.13/data-science/arangographml/deploy.md rename to site/content/3.13/data-science/graphml/deploy.md index 0d62cb12f6..dccca0b59d 100644 --- a/site/content/3.13/data-science/arangographml/deploy.md +++ b/site/content/3.13/data-science/graphml/deploy.md @@ -5,6 +5,8 @@ weight: 5 description: >- You can deploy ArangoGraphML in your own Kubernetes cluster or use the managed cloud service that comes with a ready-to-go, pre-configured environment +aliases: + - ../arangographml/deploy --- ## Managed cloud service versus self-managed diff --git a/site/content/3.13/data-science/arangographml/getting-started.md b/site/content/3.13/data-science/graphml/getting-started.md similarity index 99% rename from site/content/3.13/data-science/arangographml/getting-started.md rename to site/content/3.13/data-science/graphml/getting-started.md index 6bd614167e..51b756d81d 100644 --- a/site/content/3.13/data-science/arangographml/getting-started.md +++ b/site/content/3.13/data-science/graphml/getting-started.md @@ -5,7 +5,8 @@ weight: 10 description: >- How to control all resources inside ArangoGraphML in a scriptable manner aliases: - - getting-started-with-arangographml + - ../arangographml/getting-started-with-arangographml + - ../arangographml/getting-started --- ArangoGraphML provides an easy-to-use & scalable interface to run Graph Machine Learning on ArangoDB Data. Since all of the orchestration and ML logic is managed by ArangoGraph, all that is typically required are JSON specifications outlining individual processes to solve an ML Task. If you are using the self-managed solution, additional configurations may be required. 
diff --git a/site/content/3.13/data-science/llm-knowledge-graphs.md b/site/content/3.13/data-science/hybrid-rag.md similarity index 73% rename from site/content/3.13/data-science/llm-knowledge-graphs.md rename to site/content/3.13/data-science/hybrid-rag.md index 80d8be9666..c11cdaa8eb 100644 --- a/site/content/3.13/data-science/llm-knowledge-graphs.md +++ b/site/content/3.13/data-science/hybrid-rag.md @@ -1,9 +1,13 @@ --- -title: Large Language Models (LLMs) and Knowledge Graphs -menuTitle: Large Language Models and Knowledge Graphs -weight: 133 +title: Graph-powered HybridRAG +menuTitle: HybridRAG +weight: 10 description: >- - Integrate large language models (LLMs) with knowledge graphs using ArangoDB + ArangoDB's HybridRAG combines graph-based retrieval augmented generation + (GraphRAG) with Large Language Models (LLMs) for turbocharged GenAI solutions +aliases: + - llm-knowledge-graphs +# TODO: Repurpose for GenAI --- Large language models (LLMs) and knowledge graphs are two prominent and contrasting concepts, each possessing unique characteristics and functionalities @@ -25,7 +29,17 @@ ArangoDB's unique capabilities and flexible integration of knowledge graphs and LLMs provide a powerful and efficient solution for anyone seeking to extract valuable insights from diverse datasets. -## Knowledge Graphs +The HybridRAG component of the Data Science Suite brings all the capabilities +together with an easy-to-use interface so you can make the knowledge accessible +to your organization. + +## HybridRAG + +ArangoDB's HybridRAG solution democratizes the creation and usage of knowledge +graphs with a unique combination of vector search, graphs, and LLMs in a +single product. + +### Knowledge Graphs A knowledge graph can be thought of as a dynamic and interconnected network of real-world entities and the intricate relationships that exist between them. 
@@ -48,7 +62,29 @@ the following tasks: ![ArangoDB Knowledge Graphs and LLMs](../../images/ArangoDB-knowledge-graphs-meets-llms.png) -## ArangoDB and LangChain +### Examples + +### Services + +#### Service A + +#### Service B + +### Interfaces + +{{< tabs "interfaces" >}} + +{{< tab "Web interface" >}} +1. In the Platform UI, ... +{{< /tab >}} + +{{< tab "cURL" >}} +curl http://localhost:8529/gen-ai/ +{{< /tab >}} + +{{< /tabs >}} + +#### ArangoDB and LangChain [LangChain](https://www.langchain.com/) is a framework for developing applications powered by language models. @@ -62,7 +98,7 @@ data seamlessly via natural language, eliminating the need for query language design. By using LLM chat models such as OpenAI’s ChatGPT, you can "speak" to your data instead of querying it. -### Get started with ArangoDB QA chain +##### Get started with ArangoDB QA chain The [ArangoDB QA chain notebook](https://langchain-langchain.vercel.app/docs/use_cases/more/graph/graph_arangodb_qa.html) shows how to use LLMs to provide a natural language interface to an ArangoDB @@ -70,4 +106,4 @@ instance. Run the notebook directly in [Google Colab](https://colab.research.google.com/github/arangodb/interactive_tutorials/blob/master/notebooks/Langchain.ipynb). -See also other [machine learning interactive tutorials](https://github.com/arangodb/interactive_tutorials#machine-learning). \ No newline at end of file +See also other [machine learning interactive tutorials](https://github.com/arangodb/interactive_tutorials#machine-learning).