Skip to content
Code Insights
Documentation

Database Schema

Reference for the Code Insights SQLite database schema — projects, sessions, messages, and insights tables at ~/.code-insights/data.db.

The local SQLite database is stored at ~/.code-insights/data.db. It uses WAL mode for concurrent reads during CLI sync. All timestamps are ISO 8601 strings. Arrays and nested objects are stored as JSON strings.

Tables

projects

Stores project metadata. The project ID is derived from the git remote URL (stable across devices) or a hash of the working directory path.

ColumnTypeDefaultDescription
idTEXT PRIMARY KEYHash of git remote URL or directory path
nameTEXT NOT NULLProject display name
pathTEXT NOT NULLWorking directory path
git_remote_urlTEXTNULLGit remote URL (nullable)
project_id_sourceTEXT NOT NULL'path-hash'How the ID was derived: 'git-remote' or 'path-hash'
session_countINTEGER NOT NULL0Number of synced sessions
last_activityTEXT NOT NULLISO 8601 timestamp of most recent session
created_atTEXT NOT NULLdatetime('now')ISO 8601 timestamp
updated_atTEXT NOT NULLdatetime('now')ISO 8601 timestamp
total_input_tokensINTEGER0Cumulative input tokens across all sessions
total_output_tokensINTEGER0Cumulative output tokens
cache_creation_tokensINTEGER0Cumulative cache write tokens
cache_read_tokensINTEGER0Cumulative cache read tokens
estimated_cost_usdREAL0Cumulative estimated cost in USD

sessions

Stores session metadata with generated titles, character classification, token usage, and device info.

ColumnTypeDefaultDescription
idTEXT PRIMARY KEYSession ID from source tool
project_idTEXT NOT NULLForeign key to projects.id
project_nameTEXT NOT NULLDenormalized project name
project_pathTEXT NOT NULLDenormalized project path
git_remote_urlTEXTNULLGit remote URL at sync time
summaryTEXTNULLLLM-generated session summary
custom_titleTEXTNULLUser-provided title
generated_titleTEXTNULLAuto-generated title
title_sourceTEXTNULLWhich tier generated the title
session_characterTEXTNULLClassification: deep_focus, bug_hunt, feature_build, exploration, refactor, learning, quick_task
started_atTEXT NOT NULLISO 8601 session start time
ended_atTEXT NOT NULLISO 8601 session end time
message_countINTEGER NOT NULL0Total messages
user_message_countINTEGER NOT NULL0User messages only
assistant_message_countINTEGER NOT NULL0Assistant messages only
tool_call_countINTEGER NOT NULL0Total tool calls
git_branchTEXTNULLGit branch at session time
claude_versionTEXTNULLClaude version string (Claude Code only)
source_toolTEXT NOT NULL'claude-code'claude-code, cursor, codex-cli, copilot-cli, copilot
device_idTEXTNULLStable anonymous device identifier
device_hostnameTEXTNULLMachine hostname
device_platformTEXTNULLdarwin, linux, win32
synced_atTEXT NOT NULLdatetime('now')ISO 8601 sync timestamp
total_input_tokensINTEGERNULLTotal input tokens for session
total_output_tokensINTEGERNULLTotal output tokens
cache_creation_tokensINTEGERNULLCache write tokens
cache_read_tokensINTEGERNULLCache read tokens
estimated_cost_usdREALNULLEstimated cost in USD
models_usedTEXTNULLJSON array of all model IDs used
primary_modelTEXTNULLMost-used model ID
usage_sourceTEXTNULLHow usage was calculated
deleted_atTEXTNULLISO 8601 soft-delete timestamp (V5); NULL = visible
compact_countINTEGER NOT NULL0Number of manual context compactions in this session (V6)
auto_compact_countINTEGER NOT NULL0Number of auto-triggered context compactions (V6)
slash_commandsTEXT NOT NULL'[]'JSON array of slash commands used in this session (V6)

Indexes: project_id, started_at DESC, source_tool, deleted_at

messages

Stores full message content for all sessions.

ColumnTypeDefaultDescription
idTEXT PRIMARY KEYMessage ID
session_idTEXT NOT NULLForeign key to sessions.id
typeTEXT NOT NULLuser or assistant
contentTEXT NOT NULL''Full message content
thinkingTEXTNULLExtended thinking content (JSON)
tool_callsTEXTNULLJSON array of tool calls
tool_resultsTEXTNULLJSON array of tool results
usageTEXTNULLJSON token usage for this message
timestampTEXT NOT NULLISO 8601 message timestamp
parent_idTEXTNULLParent message ID

Indexes: session_id, (session_id, timestamp ASC)

insights

Stores LLM-generated insights. Written by the embedded dashboard server, not the CLI.

ColumnTypeDefaultDescription
idTEXT PRIMARY KEYUUID
session_idTEXT NOT NULLForeign key to sessions.id
project_idTEXT NOT NULLForeign key to projects.id
project_nameTEXT NOT NULLDenormalized project name
typeTEXT NOT NULLsummary, decision, learning, technique, prompt_quality
titleTEXT NOT NULLInsight title
contentTEXT NOT NULLFull insight text
summaryTEXT NOT NULLOne-line summary
bulletsTEXTNULLJSON array of bullet points
confidenceREAL NOT NULLConfidence score 0.0–1.0
sourceTEXT NOT NULL'llm'Always llm
metadataTEXTNULLJSON metadata blob
timestampTEXT NOT NULLISO 8601 insight timestamp
created_atTEXT NOT NULLdatetime('now')ISO 8601 creation timestamp
scopeTEXT NOT NULL'session'Always session currently
analysis_versionTEXT NOT NULL'1.0.0'Analysis pipeline version
linked_insight_idsTEXTNULLJSON array of related insight IDs

Indexes: session_id, project_id, type, timestamp DESC

usage_stats

A singleton row (id = 1) tracking cumulative totals across all sessions. Updated by the CLI on each sync.

ColumnTypeDefaultDescription
idINTEGER PRIMARY KEYAlways 1 (singleton)
total_input_tokensINTEGER0Cumulative input tokens
total_output_tokensINTEGER0Cumulative output tokens
cache_creation_tokensINTEGER0Cumulative cache write tokens
cache_read_tokensINTEGER0Cumulative cache read tokens
estimated_cost_usdREAL0Cumulative cost in USD
sessions_with_usageINTEGER0Sessions that have token data
last_updated_atTEXTdatetime('now')ISO 8601 last update time

session_facets

Stores per-session friction and effective-pattern analysis extracted by the Reflect pipeline. Added in V3.

ColumnTypeDefaultDescription
session_idTEXT PRIMARY KEYForeign key to sessions.id
outcome_satisfactionTEXT NOT NULLpositive, negative, or mixed
workflow_patternTEXTNULLDominant workflow pattern for the session
had_course_correctionINTEGER NOT NULL01 if the session involved a significant course correction
course_correction_reasonTEXTNULLBrief description of the correction
iteration_countINTEGER NOT NULL0Number of distinct iteration cycles
friction_pointsTEXTNULLJSON array of friction point objects
effective_patternsTEXTNULLJSON array of effective pattern objects
extracted_atTEXT NOT NULLdatetime('now')ISO 8601 extraction timestamp
analysis_versionTEXT NOT NULL'1.0.0'Analysis pipeline version

Indexes: outcome_satisfaction, workflow_pattern

reflect_snapshots

Caches LLM-generated weekly synthesis results. Added in V4.

ColumnTypeDefaultDescription
periodTEXT NOT NULLISO week string (e.g., 2026-W10) or custom period key
project_idTEXT NOT NULL'__all__'Project scope; __all__ = all projects
results_jsonTEXT NOT NULLJSON blob of the full synthesis result
generated_atTEXT NOT NULLISO 8601 generation timestamp
window_startTEXTNULLISO 8601 start of the analysis window
window_endTEXT NOT NULLISO 8601 end of the analysis window
session_countINTEGER NOT NULLNumber of sessions included in this snapshot
facet_countINTEGER NOT NULLNumber of facets included in this snapshot

Primary key: (period, project_id)

schema_version

Tracks applied migration versions. Checked on every startup to determine which migrations to run.

ColumnTypeDefaultDescription
versionINTEGER PRIMARY KEYMigration version number (1–7)
applied_atTEXT NOT NULLdatetime('now')ISO 8601 timestamp when the migration was applied

analysis_usage

Tracks LLM API usage and cost per analysis call, keyed by (session_id, analysis_type). Added in V7. Written by the dashboard server during session analysis.

ColumnTypeDefaultDescription
session_idTEXT NOT NULLForeign key to sessions.id
analysis_typeTEXT NOT NULLsession, prompt_quality, or facet
providerTEXT NOT NULLLLM provider name (e.g., anthropic, openai)
modelTEXT NOT NULLModel ID used for this analysis
input_tokensINTEGER NOT NULL0Input tokens billed
output_tokensINTEGER NOT NULL0Output tokens billed
cache_creation_tokensINTEGER NOT NULL0Cache write tokens (Anthropic prompt caching)
cache_read_tokensINTEGER NOT NULL0Cache read tokens (Anthropic prompt caching)
estimated_cost_usdREAL NOT NULL0Estimated cost in USD
duration_msINTEGERNULLAnalysis wall-clock duration in milliseconds
chunk_countINTEGER NOT NULL1Number of message chunks analyzed
analyzed_atTEXT NOT NULLdatetime('now')ISO 8601 analysis timestamp

Primary key: (session_id, analysis_type) — upserted on re-analysis

Index: analyzed_at DESC

Accessing the Database

You can query the database directly with any SQLite client:

sqlite3 ~/.code-insights/data.db
 
.tables
SELECT generated_title, session_character, estimated_cost_usd FROM sessions ORDER BY started_at DESC LIMIT 10;
SELECT SUM(estimated_cost_usd) FROM sessions;
SELECT source_tool, COUNT(*) FROM sessions GROUP BY source_tool;