name: pygraphistry-core
description: Core PyGraphistry workflow: auth, DataFrame-to-graph shaping, and first interactive plot. Use when asked to "register graphistry", "get started with pygraphistry", "plot my edges dataframe", "graphistry.register()", "bind src and dst columns", "make a hypergraph", "materialize nodes", or any first-graph / ETL-to-plot task. Also triggers on "first graphistry graph", "graphistry install", "api=3", or questions about graphistry auth credentials. Proactively suggest when the user is setting up graphistry for the first time or can't get a basic plot working from a DataFrame.

PyGraphistry Core

Doc routing (local + canonical)

  • First route with ../pygraphistry/references/pygraphistry-readthedocs-toc.md.
  • Use ../pygraphistry/references/pygraphistry-readthedocs-top-level.tsv for section-level shortcuts.
  • Only scan ../pygraphistry/references/pygraphistry-readthedocs-sitemap.xml when a needed page is missing.
  • Use one batched discovery read before deep-page reads; avoid cat * and serial micro-reads.
  • In user-facing answers, prefer canonical https://pygraphistry.readthedocs.io/en/latest/... links.

Quick workflow

  1. Register with a Graphistry server.
  2. Build a graph from edges/nodes (or a hypergraph from wide rows).
  3. Bind visual columns as needed.
  4. Plot and iterate.

Minimal baseline

import os
import graphistry

graphistry.register(
    api=3,
    username=os.environ.get('GRAPHISTRY_USERNAME'),
    password=os.environ.get('GRAPHISTRY_PASSWORD')
)

Auth variants (org + key flows)

# Organization-scoped login (SSO or user/pass org routing)
graphistry.register(
    api=3,
    org_name=os.environ['GRAPHISTRY_ORG_NAME'],
    idp_name=os.environ.get('GRAPHISTRY_IDP_NAME')
)
# Service account / personal key flow
graphistry.register(
    api=3,
    personal_key_id=os.environ['GRAPHISTRY_PERSONAL_KEY_ID'],
    personal_key_secret=os.environ['GRAPHISTRY_PERSONAL_KEY_SECRET']
)
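
# Basic edges + nodes graph (edges_df: src,dst,... and nodes_df: id,...)
# Add a default 'type' column only when one is not already present
edges_df['type'] = edges_df.get('type', 'transaction')
nodes_df['type'] = nodes_df.get('type', 'entity')
# Bind the edge endpoint columns and the node id column, then upload and render
g = graphistry.edges(edges_df, 'src', 'dst').nodes(nodes_df, 'id')
g.plot()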

Hypergraph baseline

# Build graph from multiple entity columns in one table
hg = graphistry.hypergraph(df, ['actor', 'event', 'location'])
# hypergraph() returns a dict; the 'graph' entry is the plottable graph
hg['graph'].plot()

ETL shaping checklist

  • Normalize identifier columns before binding (src/dst/id type consistency, null handling).
  • Prefer a plain type column on both edges and nodes for legend-friendly defaults and consistent category encodings.
  • Deduplicate high-volume repeated rows before first upload.
  • Materialize nodes for node-centric steps:
g = graphistry.edges(edges_df, 'src', 'dst').materialize_nodes()
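
The first three checklist items can be sketched in pandas roughly as follows. The helper name `shape_edges` and the sample data are hypothetical, and casting identifiers to strings is only one possible normalization choice, not a PyGraphistry requirement.

```python
import pandas as pd


def shape_edges(edges_df: pd.DataFrame, src: str = "src", dst: str = "dst") -> pd.DataFrame:
    """Return a cleaned copy of edges_df (hypothetical helper, not part of PyGraphistry)."""
    # Drop rows that are missing either endpoint.
    out = edges_df.dropna(subset=[src, dst]).copy()
    # Cast both endpoint columns to strings so ids such as 1 and "1" match.
    out[src] = out[src].astype(str)
    out[dst] = out[dst].astype(str)
    # Add a default 'type' column only when the caller has not supplied one.
    if "type" not in out.columns:
        out["type"] = "transaction"
    # Remove exact duplicate rows before the first upload.
    return out.drop_duplicates().reset_index(drop=True)


# Sample data: integer src ids, string dst ids, one duplicate row, one null endpoint.
raw = pd.DataFrame({
    "src": [1, 1, 2, 3],
    "dst": ["2", "2", "3", None],
})
edges_clean = shape_edges(raw)
```

The cleaned frame can then be passed to `graphistry.edges(edges_clean, 'src', 'dst')` as in the snippets above. Note that an integer column containing nulls is stored as floats by pandas, so ids in such a column may need an explicit integer cast before the string cast.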

Practical checks

  • Confirm source/destination columns are non-null and correctly typed.
  • Materialize nodes if needed (g.materialize_nodes()) before node-centric operations.
  • Start with smaller slices for first render on large data.
  • For neighborhood expansion and pattern mining, always use .gfql([...]) or .gfql("MATCH ..."). The methods hop() and chain() are deprecated.
  • Keep credentials in environment variables only; do not hardcode usernames/passwords/tokens.
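
One way to follow the credentials rule is to assemble the `graphistry.register()` arguments from environment variables and fail early when they are missing. This is a minimal sketch: the helper name `register_kwargs` is hypothetical, and preferring the personal key flow over username/password is an assumed design choice, not documented PyGraphistry behavior.

```python
import os
from typing import Mapping, Optional


def register_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for graphistry.register() (hypothetical helper)."""
    env = os.environ if env is None else env

    # Assumed preference: use the personal key flow when both parts are set.
    key_id = env.get("GRAPHISTRY_PERSONAL_KEY_ID")
    key_secret = env.get("GRAPHISTRY_PERSONAL_KEY_SECRET")
    if key_id and key_secret:
        return {"api": 3, "personal_key_id": key_id, "personal_key_secret": key_secret}

    # Otherwise fall back to username/password.
    username = env.get("GRAPHISTRY_USERNAME")
    password = env.get("GRAPHISTRY_PASSWORD")
    if username and password:
        return {"api": 3, "username": username, "password": password}

    # Fail loudly instead of passing None values on to register().
    raise RuntimeError(
        "Set GRAPHISTRY_USERNAME and GRAPHISTRY_PASSWORD, or "
        "GRAPHISTRY_PERSONAL_KEY_ID and GRAPHISTRY_PERSONAL_KEY_SECRET"
    )


# Typical use, once the variables are exported in the shell:
# graphistry.register(**register_kwargs())
```

Accepting an optional mapping keeps the helper easy to test without touching the real environment.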

Canonical docs