Now in private beta · Desktop for macOS & Windows

Raw spreadsheets in.
Trained models out.

Cogentic is an AI‑guided desktop app that profiles your data, fixes the messy parts, and trains, evaluates, and visualizes models on top of it — all in one workflow. Built for the people who want to ship models, not fight CSVs.

CSV · XLSX · Parquet
Runs locally on your machine
OpenAI · Anthropic · Gemini

CSV · XLSX · Parquet · DuckDB · PyArrow · OpenAI · Anthropic · Gemini · pandas · scikit‑learn · CatBoost · PyTorch

How it works

Upload, clean, train — in one workflow.

No SQL. No notebooks. No “go ask the data team.” Drop in a file, walk the wizard, ship a model.

01

Upload

Drop in CSV, XLSX, or Parquet. Cogentic ingests it as a bronze snapshot you can return to at any time.

02

Profile & clean

Auto‑profile reveals nulls, types, distincts, and ranges. The wizard walks you through schema, ranges, replacements, derivations, and missing‑value strategy — with AI suggestions every step of the way.
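For readers who want a feel for what a profile pass computes, here is a minimal sketch in plain Python. It is not Cogentic's implementation: the function names (`profile_column`, `profile_csv`), the output keys, and the sample data are all hypothetical, and it only covers CSV text with a simple numeric/text type guess.

```python
import csv
import io


def profile_column(values):
    """Summarize one column: null share, distinct count, guessed type, range."""
    non_null = [v for v in values if v not in ("", None)]
    numeric = []
    for v in non_null:
        try:
            numeric.append(float(v))
        except ValueError:
            pass
    # Treat the column as numeric only if every non-null value parsed.
    all_numeric = bool(non_null) and len(numeric) == len(non_null)
    null_pct = 100 * (len(values) - len(non_null)) / len(values) if values else 0.0
    return {
        "null_pct": round(null_pct, 1),
        "distinct": len(set(non_null)),
        "type": "numeric" if all_numeric else "text",
        "min": min(numeric) if all_numeric else None,
        "max": max(numeric) if all_numeric else None,
    }


def profile_csv(text):
    """Profile every column of a CSV document passed in as a string."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys()) if rows else []
    return {col: profile_column([row[col] for row in rows]) for col in columns}


# Hypothetical sample: one missing altitude, one missing airline.
sample = "altitude,airline\n1200,AA\n,BA\n-5,AA\n900,\n"
report = profile_csv(sample)
print(report["altitude"])
```

A profile like this is what makes the later wizard steps concrete: the negative minimum in `altitude` is exactly the kind of value a range rule would catch.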

03

Train & evaluate

Pick a target, let Cogentic propose an algorithm (or choose one yourself), train, and inspect residuals, distributions, and clusters — all in‑app.

Inside the app

A complete ML workspace, end to end.

From a raw table on disk to a trained model with diagnostics — without leaving Cogentic.

Cogentic Trained Models view showing R² score, RMSE, MAE, residuals vs predicted scatter plot, and residual distribution histogram
Trained Models · Visuals

Inspect every model with first‑class diagnostics.

R², RMSE, MAE, CV stats, residual scatter and distribution — one click after training. Catch bias and skew before you ship.
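The headline metrics follow their standard textbook definitions, which can be sketched in a few lines of plain Python. This is an illustration only, with a hypothetical `regression_metrics` helper and made-up numbers; it is not how Cogentic computes its scorecard. The mean residual is included because a value far from zero is one simple sign of bias.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R², RMSE, MAE and the mean residual for paired values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)               # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total variation
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "mean_residual": sum(residuals) / n,
    }


# Hypothetical targets and predictions.
metrics = regression_metrics(
    [100.0, 150.0, 200.0, 250.0],
    [110.0, 140.0, 190.0, 260.0],
)
print(metrics)
```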

ML Plan Preview screen with target column rul, 80/20 train-test split slider, algorithm selection of Auto Rule-based, Auto LLM, or Manual, and a CatBoostRegressor plan
ML Plan

AI picks the algorithm, you review the plan.

Pick a target column, drag the train/test split, and let the rule‑based or LLM auto‑configuration choose the algorithm. Review the full plan before any training starts.
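To make the split slider concrete, here is a small sketch of a reproducible 80/20 split over row indices. The helper name `split_rows`, the seed, and the choice to round the test set up are assumptions for illustration; the app's actual split logic is not documented on this page.

```python
import math
import random


def split_rows(n_rows, test_fraction=0.2, seed=42):
    """Shuffle row indices reproducibly and return (train_idx, test_idx)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)          # same seed, same split
    n_test = math.ceil(n_rows * test_fraction)    # assumption: round test set up
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_rows(20631, test_fraction=0.2)
print(len(train_idx), len(test_idx))
```

With 20,631 rows this yields 16,504 training rows and 4,127 test rows, which happens to match the counts shown on the Training Complete card.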

Training Complete card showing R² 0.9765, RMSE 10.4031, MAE 8.1047, CV Mean -10.7476, train rows 16504, test rows 4127, train time 1.0568s, with download experiment bundle button
Training

Train, score, and bundle for export.

Live training progress, full metric scorecards, and a one‑click experiment bundle — model.pkl, schema, plan, and metrics in a single archive.

PCA scatter plot showing two clusters along PC1 (52.5% variance) and PC2 (12.1% variance) with Cluster 0 in blue and Cluster 1 in orange
PCA & Clustering

See structure before you model it.

Auto PCA + KMeans surfaces the natural groupings hiding in your features — with silhouette‑based cluster selection done for you.

Cluster Insights panel showing Cluster 0 with 1386 points 69 percent and average target 138.4156, and Cluster 1 with 614 points 31 percent and average target 41.7899, with feature importances for s11, s12, s4
Cluster Insights

Know what each cluster means.

For every cluster, Cogentic surfaces size, average target, and the features driving it — turning a scatter plot into a story you can act on.

Trained Models page showing model cards on the left, a model details pane with Dataset ID 44830d74cb9a, target rul, completed status, and full quality metrics
Trained Models · Data

Every run, traceable. Every dataset, versioned.

Dataset IDs, query IDs, target columns, and quality metrics travel with each model. Re‑train next quarter on the same plan, or compare models on the leaderboard side by side.

What's inside

Every messy data chore — handled.

A focused toolkit for the unglamorous 80% of ML work: making the data not embarrassing.

Range filtering with live row counts

Set min/max bounds on any numeric column and watch row counts update in real time. Know exactly how aggressive you're being before a single row gets dropped.

DuckDB · PyArrow
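The idea behind the live counts is simply to count what a bound would keep before anything is dropped. The sketch below uses plain Python rather than DuckDB, and the helper `count_in_range`, the sample values, and the decision to count missing values as dropped are all assumptions made for illustration.

```python
def count_in_range(values, lower=None, upper=None):
    """Preview a range rule: how many values it keeps and how many it drops."""
    kept = sum(
        1
        for v in values
        if v is not None                      # assumption: missing values fail the rule
        and (lower is None or v >= lower)
        and (upper is None or v <= upper)
    )
    return {"kept": kept, "dropped": len(values) - kept}


# Hypothetical altitude column with one negative, one extreme, one missing value.
altitudes = [-5.0, 0.0, 900.0, 1200.0, 41000.0, None]
preview = count_in_range(altitudes, lower=0, upper=40000)
print(preview)
```

Because the function only counts, the bounds can be adjusted freely and the data stays untouched until a rule is actually committed.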

LLM‑powered cleaning plans

Plug in OpenAI, Anthropic, or Gemini and let a model read your profile and propose atomic, reviewable changes — “drop the <0 altitudes,” “impute fuel_burn with the median,” “coerce timestamps to UTC.” Every suggestion is yours to accept, reject, or tweak.

OpenAI · Anthropic · Gemini

Smart missing‑value handling

Drop rows, impute mean / median / mode, or fill with a constant — picked per column, previewed before commit.
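As a rough sketch of what a per-column strategy means, the snippet below applies one strategy to a single column using Python's `statistics` module. The `fill_missing` helper and the `fuel_burn` values are hypothetical, and "drop" here just removes the missing entries from the one column, whereas a real table would drop whole rows.

```python
from statistics import mean, median, mode


def fill_missing(values, strategy, constant=None):
    """Apply one missing-value strategy to a single column (None = missing)."""
    observed = [v for v in values if v is not None]
    if strategy == "drop":
        return observed
    if strategy == "constant":
        fill = constant
    elif strategy in ("mean", "median", "mode"):
        # Computed from observed values only; raises if the column is all missing.
        fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Hypothetical column with one missing value; compare strategies before committing.
fuel_burn = [10.0, None, 14.0, 30.0]
preview = {s: fill_missing(fuel_burn, s) for s in ("drop", "mean", "median")}
print(preview)
```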

Auto ML profiling

One pass produces null %, distinct counts, min/max, sample values, plus a clean schema and context JSON your training pipeline can read straight off disk.

Visualization suggestions

Histograms, correlation matrices, scatter plots, residuals, PCA — surfaced for the columns and models most likely to matter.

Reproducible & versioned

Every run produces a JSONL operation log, DQ report, and versioned silver table. Re‑run the exact pipeline next quarter — same plan, fresh data.
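One way to picture "same plan, fresh data" is a log with one JSON object per line that can be replayed against a new table. The sketch below is an assumption throughout: the field names (`step`, `op`, `params`), the two operation types, and the helpers `append_operation` and `replay` are invented for illustration and do not describe Cogentic's actual log format.

```python
import json


def append_operation(log_lines, op, **params):
    """Append one operation to the log as a single JSON line."""
    entry = {"step": len(log_lines) + 1, "op": op, "params": params}
    log_lines.append(json.dumps(entry, sort_keys=True))


def replay(log_lines, rows):
    """Re-apply logged operations, in order, to a fresh list of row dicts."""
    rows = [dict(r) for r in rows]  # copy so the input is left untouched
    for line in log_lines:
        entry = json.loads(line)
        p = entry["params"]
        if entry["op"] == "filter_range":
            col = p["column"]
            rows = [
                r for r in rows
                if r[col] is not None and p["lower"] <= r[col] <= p["upper"]
            ]
        elif entry["op"] == "fill_constant":
            for r in rows:
                if r[p["column"]] is None:
                    r[p["column"]] = p["value"]
        else:
            raise ValueError(f"unknown operation: {entry['op']}")
    return rows


# Hypothetical plan: drop impossible altitudes, then fill missing fuel_burn.
log = []
append_operation(log, "filter_range", column="altitude", lower=0, upper=40000)
append_operation(log, "fill_constant", column="fuel_burn", value=0.0)

fresh_rows = [
    {"altitude": -5, "fuel_burn": 10.0},
    {"altitude": 900, "fuel_burn": None},
    {"altitude": 1200, "fuel_burn": 14.0},
]
cleaned = replay(log, fresh_rows)
print(cleaned)
```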

Built‑in inference

Test your trained model on new rows without leaving Cogentic. Paste a row, get a prediction — with feature attributions and confidence on the side.

Leaderboard & comparison

Every training run lands on a leaderboard you can sort and compare. R², RMSE, train time, dataset version — pick the winner with the receipts.

Output you can hand straight to a training script.

Every export bundles a typed silver Parquet, an ML profile JSON, a DQ report, and a JSONL operation log — ready to drop into pandas, scikit‑learn, PyTorch, or any pipeline that reads off disk.

  • Typed silver Parquet (one source of truth)
  • ML profile JSON with per‑column stats and target inference
  • DQ report: row deltas, coercions, warnings
  • JSONL operation log of every cleaning step
  • Experiment bundle: model.pkl + schema + plan + metrics

Clean the data. Train the model. Ship.

Early access is open. Be the first to take Cogentic for a spin.

Get Early Access