Database Schema
Reference for the Code Insights SQLite database schema — projects, sessions, messages, and insights tables at ~/.code-insights/data.db.
The local SQLite database is stored at ~/.code-insights/data.db. It uses WAL mode for concurrent reads during CLI sync. All timestamps are ISO 8601 strings. Arrays and nested objects are stored as JSON strings.
Tables
projects
Stores project metadata. The project ID is derived from the git remote URL (stable across devices) or a hash of the working directory path.
| Column | Type | Default | Description |
|---|---|---|---|
id | TEXT PRIMARY KEY | — | Hash of git remote URL or directory path |
name | TEXT NOT NULL | — | Project display name |
path | TEXT NOT NULL | — | Working directory path |
git_remote_url | TEXT | NULL | Git remote URL (nullable) |
project_id_source | TEXT NOT NULL | 'path-hash' | How the ID was derived: 'git-remote' or 'path-hash' |
session_count | INTEGER NOT NULL | 0 | Number of synced sessions |
last_activity | TEXT NOT NULL | — | ISO 8601 timestamp of most recent session |
created_at | TEXT NOT NULL | datetime('now') | ISO 8601 timestamp |
updated_at | TEXT NOT NULL | datetime('now') | ISO 8601 timestamp |
total_input_tokens | INTEGER | 0 | Cumulative input tokens across all sessions |
total_output_tokens | INTEGER | 0 | Cumulative output tokens |
cache_creation_tokens | INTEGER | 0 | Cumulative cache write tokens |
cache_read_tokens | INTEGER | 0 | Cumulative cache read tokens |
estimated_cost_usd | REAL | 0 | Cumulative estimated cost in USD |
sessions
Stores session metadata with generated titles, character classification, token usage, and device info.
| Column | Type | Default | Description |
|---|---|---|---|
id | TEXT PRIMARY KEY | — | Session ID from source tool |
project_id | TEXT NOT NULL | — | Foreign key to projects.id |
project_name | TEXT NOT NULL | — | Denormalized project name |
project_path | TEXT NOT NULL | — | Denormalized project path |
git_remote_url | TEXT | NULL | Git remote URL at sync time |
summary | TEXT | NULL | LLM-generated session summary |
custom_title | TEXT | NULL | User-provided title |
generated_title | TEXT | NULL | Auto-generated title |
title_source | TEXT | NULL | Which tier generated the title |
session_character | TEXT | NULL | Classification: deep_focus, bug_hunt, feature_build, exploration, refactor, learning, quick_task |
started_at | TEXT NOT NULL | — | ISO 8601 session start time |
ended_at | TEXT NOT NULL | — | ISO 8601 session end time |
message_count | INTEGER NOT NULL | 0 | Total messages |
user_message_count | INTEGER NOT NULL | 0 | User messages only |
assistant_message_count | INTEGER NOT NULL | 0 | Assistant messages only |
tool_call_count | INTEGER NOT NULL | 0 | Total tool calls |
git_branch | TEXT | NULL | Git branch at session time |
claude_version | TEXT | NULL | Claude version string (Claude Code only) |
source_tool | TEXT NOT NULL | 'claude-code' | claude-code, cursor, codex-cli, copilot-cli, copilot |
device_id | TEXT | NULL | Stable anonymous device identifier |
device_hostname | TEXT | NULL | Machine hostname |
device_platform | TEXT | NULL | darwin, linux, win32 |
synced_at | TEXT NOT NULL | datetime('now') | ISO 8601 sync timestamp |
total_input_tokens | INTEGER | NULL | Total input tokens for session |
total_output_tokens | INTEGER | NULL | Total output tokens |
cache_creation_tokens | INTEGER | NULL | Cache write tokens |
cache_read_tokens | INTEGER | NULL | Cache read tokens |
estimated_cost_usd | REAL | NULL | Estimated cost in USD |
models_used | TEXT | NULL | JSON array of all model IDs used |
primary_model | TEXT | NULL | Most-used model ID |
usage_source | TEXT | NULL | How usage was calculated |
deleted_at | TEXT | NULL | ISO 8601 soft-delete timestamp (V5); NULL = visible |
compact_count | INTEGER NOT NULL | 0 | Number of manual context compactions in this session (V6) |
auto_compact_count | INTEGER NOT NULL | 0 | Number of auto-triggered context compactions (V6) |
slash_commands | TEXT NOT NULL | '[]' | JSON array of slash commands used in this session (V6) |
Indexes: project_id, started_at DESC, source_tool, deleted_at
messages
Stores full message content for all sessions.
| Column | Type | Default | Description |
|---|---|---|---|
id | TEXT PRIMARY KEY | — | Message ID |
session_id | TEXT NOT NULL | — | Foreign key to sessions.id |
type | TEXT NOT NULL | — | user or assistant |
content | TEXT NOT NULL | '' | Full message content |
thinking | TEXT | NULL | Extended thinking content (JSON) |
tool_calls | TEXT | NULL | JSON array of tool calls |
tool_results | TEXT | NULL | JSON array of tool results |
usage | TEXT | NULL | JSON token usage for this message |
timestamp | TEXT NOT NULL | — | ISO 8601 message timestamp |
parent_id | TEXT | NULL | Parent message ID |
Indexes: session_id, (session_id, timestamp ASC)
insights
Stores LLM-generated insights. Written by the embedded dashboard server, not the CLI.
| Column | Type | Default | Description |
|---|---|---|---|
id | TEXT PRIMARY KEY | — | UUID |
session_id | TEXT NOT NULL | — | Foreign key to sessions.id |
project_id | TEXT NOT NULL | — | Foreign key to projects.id |
project_name | TEXT NOT NULL | — | Denormalized project name |
type | TEXT NOT NULL | — | summary, decision, learning, technique, prompt_quality |
title | TEXT NOT NULL | — | Insight title |
content | TEXT NOT NULL | — | Full insight text |
summary | TEXT NOT NULL | — | One-line summary |
bullets | TEXT | NULL | JSON array of bullet points |
confidence | REAL NOT NULL | — | Confidence score 0.0–1.0 |
source | TEXT NOT NULL | 'llm' | Always llm |
metadata | TEXT | NULL | JSON metadata blob |
timestamp | TEXT NOT NULL | — | ISO 8601 insight timestamp |
created_at | TEXT NOT NULL | datetime('now') | ISO 8601 creation timestamp |
scope | TEXT NOT NULL | 'session' | Always session currently |
analysis_version | TEXT NOT NULL | '1.0.0' | Analysis pipeline version |
linked_insight_ids | TEXT | NULL | JSON array of related insight IDs |
Indexes: session_id, project_id, type, timestamp DESC
usage_stats
A singleton row (id = 1) tracking cumulative totals across all sessions. Updated by the CLI on each sync.
| Column | Type | Default | Description |
|---|---|---|---|
id | INTEGER PRIMARY KEY | — | Always 1 (singleton) |
total_input_tokens | INTEGER | 0 | Cumulative input tokens |
total_output_tokens | INTEGER | 0 | Cumulative output tokens |
cache_creation_tokens | INTEGER | 0 | Cumulative cache write tokens |
cache_read_tokens | INTEGER | 0 | Cumulative cache read tokens |
estimated_cost_usd | REAL | 0 | Cumulative cost in USD |
sessions_with_usage | INTEGER | 0 | Sessions that have token data |
last_updated_at | TEXT | datetime('now') | ISO 8601 last update time |
session_facets
Stores per-session friction and effective-pattern analysis extracted by the Reflect pipeline. Added in V3.
| Column | Type | Default | Description |
|---|---|---|---|
session_id | TEXT PRIMARY KEY | — | Foreign key to sessions.id |
outcome_satisfaction | TEXT NOT NULL | — | positive, negative, or mixed |
workflow_pattern | TEXT | NULL | Dominant workflow pattern for the session |
had_course_correction | INTEGER NOT NULL | 0 | 1 if the session involved a significant course correction |
course_correction_reason | TEXT | NULL | Brief description of the correction |
iteration_count | INTEGER NOT NULL | 0 | Number of distinct iteration cycles |
friction_points | TEXT | NULL | JSON array of friction point objects |
effective_patterns | TEXT | NULL | JSON array of effective pattern objects |
extracted_at | TEXT NOT NULL | datetime('now') | ISO 8601 extraction timestamp |
analysis_version | TEXT NOT NULL | '1.0.0' | Analysis pipeline version |
Indexes: outcome_satisfaction, workflow_pattern
reflect_snapshots
Caches LLM-generated weekly synthesis results. Added in V4.
| Column | Type | Default | Description |
|---|---|---|---|
period | TEXT NOT NULL | — | ISO week string (e.g., 2026-W10) or custom period key |
project_id | TEXT NOT NULL | '__all__' | Project scope; __all__ = all projects |
results_json | TEXT NOT NULL | — | JSON blob of the full synthesis result |
generated_at | TEXT NOT NULL | — | ISO 8601 generation timestamp |
window_start | TEXT | NULL | ISO 8601 start of the analysis window |
window_end | TEXT NOT NULL | — | ISO 8601 end of the analysis window |
session_count | INTEGER NOT NULL | — | Number of sessions included in this snapshot |
facet_count | INTEGER NOT NULL | — | Number of facets included in this snapshot |
Primary key: (period, project_id)
schema_version
Tracks applied migration versions. Checked on every startup to determine which migrations to run.
| Column | Type | Default | Description |
|---|---|---|---|
version | INTEGER PRIMARY KEY | — | Migration version number (1–7) |
applied_at | TEXT NOT NULL | datetime('now') | ISO 8601 timestamp when the migration was applied |
analysis_usage
Tracks LLM API usage and cost per analysis call, keyed by (session_id, analysis_type). Added in V7. Written by the dashboard server during session analysis.
| Column | Type | Default | Description |
|---|---|---|---|
session_id | TEXT NOT NULL | — | Foreign key to sessions.id |
analysis_type | TEXT NOT NULL | — | session, prompt_quality, or facet |
provider | TEXT NOT NULL | — | LLM provider name (e.g., anthropic, openai) |
model | TEXT NOT NULL | — | Model ID used for this analysis |
input_tokens | INTEGER NOT NULL | 0 | Input tokens billed |
output_tokens | INTEGER NOT NULL | 0 | Output tokens billed |
cache_creation_tokens | INTEGER NOT NULL | 0 | Cache write tokens (Anthropic prompt caching) |
cache_read_tokens | INTEGER NOT NULL | 0 | Cache read tokens (Anthropic prompt caching) |
estimated_cost_usd | REAL NOT NULL | 0 | Estimated cost in USD |
duration_ms | INTEGER | NULL | Analysis wall-clock duration in milliseconds |
chunk_count | INTEGER NOT NULL | 1 | Number of message chunks analyzed |
analyzed_at | TEXT NOT NULL | datetime('now') | ISO 8601 analysis timestamp |
Primary key: (session_id, analysis_type) — upserted on re-analysis
Index: analyzed_at DESC
Accessing the Database
You can query the database directly with any SQLite client:
sqlite3 ~/.code-insights/data.db
.tables
SELECT generated_title, session_character, estimated_cost_usd FROM sessions ORDER BY started_at DESC LIMIT 10;
SELECT SUM(estimated_cost_usd) FROM sessions;
SELECT source_tool, COUNT(*) FROM sessions GROUP BY source_tool;