Implementation Roadmap
Seven milestones delivered. The language server, RAG knowledge system, MCP server, web chat, documentation site, and compiler validation tooling are all operational. Fine-tuning research has identified the path forward: Qwen2.5-Coder-14B-Base with evaluation via the BBj compiler (compile@1). This chapter summarizes what exists and what comes next.
The preceding chapters describe the technical blueprint for each component of the BBj AI strategy. This chapter reports where things stand: a progress summary organized by component status, followed by a forward plan grounded in what has already been built.
Where We Stand
The original strategy paper (January 2025) proposed an architecture. Thirteen months later, most of that architecture is operational. The table below compares the paper's starting point with the current state.
| Component | Paper Status (Jan 2025) | Actual (Feb 2026) |
|---|---|---|
| Training data | Schema defined, no curated examples | 9,922 ChatML examples (bbjllm); 2 seed examples in training-data/ with JSON Schema validation |
| Base model | Candidates identified (CodeLlama, StarCoder2) | Qwen2.5-Coder-14B-Base recommended; bbjllm experiment validated Qwen2.5-Coder family |
| Language server | Architecture planned | v0.5.0 operational, 508 commits, 13 contributors, VS Code Marketplace |
| IDE integration | Not mentioned | Continue.dev evaluated as primary path; Copilot BYOK researched (chat only) |
| RAG database | Schema designed | Operational -- 7 parsers, 51K+ chunks, PostgreSQL + pgvector, hybrid retrieval |
| Documentation chat | Architecture planned | Operational -- Claude API + RAG, SSE streaming, source citations, auto BBj validation |
| MCP server | Not mentioned | Operational -- 2 tools (search_bbj_knowledge, validate_bbj_syntax), stdio + Streamable HTTP |
| Compiler validation | Not mentioned | Operational -- bbjcpltool v1 integrated into MCP server and web chat |
| Documentation site | Not mentioned | Operational -- Docusaurus site with 7 chapters covering full strategy |
What We Built
The components below are organized by status tier -- operational systems first, then systems running for internal exploration, then active research areas.
Operational
Language server -- v0.5.0 with 508 commits and 13 contributors, available on the VS Code Marketplace. Provides syntax highlighting, code completion, diagnostics, formatting, and code execution across the bbj-vscode and bbj-intellij extensions. See Chapter 4.
Documentation site -- Docusaurus 3.9.2 site with 7 chapters covering the full BBj AI strategy, from problem statement through implementation roadmap. See Chapter 1.
Compiler validation -- bbjcpltool v1 validated and integrated into the MCP server (validate_bbj_syntax tool) and web chat (automatic code validation with 3-attempt auto-fix). Uses bbjcpl for ground-truth syntax checking. See Chapter 2.
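The 3-attempt auto-fix behaviour described for the web chat can be pictured as a small validate-and-repair loop. The sketch below is illustrative only: `validate` and `fix` are hypothetical callables (in practice they would wrap bbjcpl and a model call respectively), and reading "3-attempt" as three repair attempts after the first check is an assumption, not something this chapter specifies.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures, not the project's real API:
# - a validator returns compiler error messages (an empty list means the code compiled)
# - a fixer returns a repaired version of the code given those errors
Validator = Callable[[str], List[str]]
Fixer = Callable[[str, List[str]], str]


def validate_with_autofix(
    code: str,
    validate: Validator,
    fix: Fixer,
    max_attempts: int = 3,  # assumed meaning of "3-attempt auto-fix"
) -> Tuple[str, bool, int]:
    """Return (final_code, is_valid, repair_attempts_used)."""
    errors = validate(code)
    attempts = 0
    # Keep asking for a repair until the code validates or the budget runs out.
    while errors and attempts < max_attempts:
        code = fix(code, errors)
        attempts += 1
        errors = validate(code)
    return code, not errors, attempts
```

Passing the validator and fixer in as parameters keeps the loop testable without a BBj installation; the real components can be swapped in behind the same two signatures.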
Operational for Internal Exploration
RAG knowledge system -- 7 source parsers ingesting 51K+ chunks into PostgreSQL + pgvector with hybrid retrieval (dense vectors + BM25 + reciprocal rank fusion + cross-encoder reranking). Accessible via REST API with search, stats, and health endpoints. See Chapter 6.
MCP server -- 2 operational tools: search_bbj_knowledge (semantic search across the documentation corpus) and validate_bbj_syntax (BBj compiler validation via bbjcpl). Available via stdio and Streamable HTTP transports. See Chapter 2.
Web chat -- Available at /chat on the documentation site. Claude API backend with RAG retrieval from the 51K+ chunk corpus, SSE streaming, source citations with clickable documentation links, and automatic BBj code validation. See Chapter 5.
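The reciprocal rank fusion step named in the RAG knowledge system entry above merges the dense-vector ranking and the BM25 ranking into one list before reranking. A minimal sketch follows, assuming each retriever returns an ordered list of chunk IDs; the constant `k = 60` is the value commonly used for RRF and is an assumption here, not a setting taken from the project.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]],
    k: int = 60,  # assumed smoothing constant; the project's value is not stated
) -> List[str]:
    """Fuse several ranked lists of chunk IDs into a single ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every chunk it contains.
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable result.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


dense_hits = ["a", "b", "c"]  # hypothetical dense-vector results
bm25_hits = ["b", "d", "a"]   # hypothetical BM25 results
fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
```

Because RRF uses only rank positions, it avoids having to normalise cosine similarities against BM25 scores, which sit on different scales.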
Active Research
Fine-tuning -- The bbjllm experiment fine-tuned Qwen2.5-Coder-32B-Instruct on 9,922 ChatML examples via QLoRA. Research recommends switching to Qwen2.5-Coder-14B-Base with two-stage training (continued pretraining + instruction fine-tuning) and bbjcpl-based compile@1 evaluation. See Chapter 3.
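As a rough illustration of the compile@1 metric, the sketch below scores a model by the share of prompts whose single generated sample compiles. That reading (by analogy with pass@1) is an assumption, and both `generate` and `compiles` are hypothetical callables: the first stands in for the fine-tuned model, the second for a wrapper around bbjcpl.

```python
from typing import Callable, Sequence


def compile_at_1(
    prompts: Sequence[str],
    generate: Callable[[str], str],   # hypothetical: one sample per prompt
    compiles: Callable[[str], bool],  # hypothetical: True if bbjcpl accepts the code
) -> float:
    """Fraction of prompts whose first generated sample compiles."""
    if not prompts:
        raise ValueError("compile@1 is undefined for an empty benchmark")
    passed = sum(1 for prompt in prompts if compiles(generate(prompt)))
    return passed / len(prompts)
```

A compile check only establishes that the output is syntactically valid BBj; it says nothing about whether the program does what the prompt asked, so it works best as a floor rather than a complete quality measure.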
What Comes Next
The forward plan is organized by area. These are concrete next steps, not a phased rollout.
Fine-tuning and evaluation:
- Build a compile@1 benchmark using bbjcpl to measure whether generated BBj code compiles, with a test set split off from the training data and held out from training.
- Add a validation set (10% holdout), implement completion masking to stop training on prompt tokens, and switch from the 32B-Instruct model to 14B-Base.
- Deduplicate approximately 375 entries and fix approximately 60 formatting issues in the bbjllm dataset.
- Implement two-stage training: continued pretraining on raw BBj source code followed by instruction fine-tuning on ChatML examples, using Unsloth 2026.1.4.
- Quantize the fine-tuned model to GGUF Q4_K_M format and serve via Ollama for local developer use.
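The dataset clean-up and 10% holdout steps in the list above could look roughly like the following. This is a sketch under assumptions: each example is taken to be a dict with a `messages` list of `role`/`content` pairs (a common ChatML layout, not confirmed for bbjllm), and duplicates are detected by hashing whitespace-normalised content, which may differ from how the approximately 375 duplicate entries were actually identified.

```python
import hashlib
import json
import random
from typing import Dict, List, Tuple

Example = Dict[str, list]  # assumed shape: {"messages": [{"role": ..., "content": ...}]}


def fingerprint(example: Example) -> str:
    """Hash roles plus whitespace-normalised content so near-identical copies collide."""
    parts = [(m["role"], " ".join(m["content"].split())) for m in example["messages"]]
    return hashlib.sha256(json.dumps(parts).encode("utf-8")).hexdigest()


def dedupe(examples: List[Example]) -> List[Example]:
    """Keep the first occurrence of each distinct example, preserving order."""
    seen = set()
    unique = []
    for example in examples:
        key = fingerprint(example)
        if key not in seen:
            seen.add(key)
            unique.append(example)
    return unique


def split_holdout(
    examples: List[Example], fraction: float = 0.10, seed: int = 0
) -> Tuple[List[Example], List[Example]]:
    """Shuffle deterministically and return (train, validation)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * fraction))
    return shuffled[n_val:], shuffled[:n_val]
```

Deduplicating before splitting matters: if copies of one example land on both sides of the split, validation loss and compile@1 both look better than they should.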
IDE integration:
- Train a fill-in-the-middle (FIM) variant of the BBj model to support tab completion.
- Connect the fine-tuned model to Continue.dev for chat (instruction-tuned) and autocomplete (FIM-trained).
- Implement ghost text completions via InlineCompletionItemProvider in the language server extensions, using Langium semantic context to enrich LLM prompts.
- Extend the Langium parser with generation detection to identify which BBj generation the current code belongs to.
- Build a semantic context API within the language server that assembles scope, type, and generation information for LLM prompts.
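To make the fill-in-the-middle item above concrete, the sketch below turns one source file into a FIM training pair by cutting out a random span. The three sentinel strings are the FIM tokens documented for the Qwen2.5-Coder family; that the BBj model would keep them unchanged, and that spans are chosen at random character offsets, are assumptions made for illustration.

```python
import random
from typing import Dict

# FIM sentinel tokens as documented for Qwen2.5-Coder (assumed unchanged here).
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"


def make_fim_example(source: str, rng: random.Random) -> Dict[str, str]:
    """Cut a random non-empty span out of `source` and build a prompt/completion pair."""
    if not source:
        raise ValueError("source must be non-empty")
    # Two distinct cut points guarantee a non-empty middle span.
    start, end = sorted(rng.sample(range(len(source) + 1), 2))
    prefix, middle, suffix = source[:start], source[start:end], source[end:]
    prompt = f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"
    return {"prompt": prompt, "completion": middle}


# Hypothetical two-line BBj snippet used only to exercise the function.
example = make_fim_example('print "Hello"\nrelease\n', random.Random(0))
```

At inference time the editor supplies the text before and after the cursor as prefix and suffix, and the model's output after the middle token becomes the suggested completion.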
Infrastructure:
- Add a generate_bbj_code tool to the MCP server once a fine-tuned model is operational, completing the generate-validate-fix loop.
Status summary:
- Operational: Language server (v0.5.0, VS Code Marketplace), documentation site (7 chapters), compiler validation (bbjcpltool v1, integrated into MCP + chat)
- Operational for internal exploration: RAG knowledge system (51K+ chunks, hybrid retrieval), MCP server (2 tools, stdio + Streamable HTTP), web chat (Claude API + RAG, SSE streaming, source citations)
- Active research: Fine-tuning (14B-Base recommended, compile@1 evaluation designed, two-stage training approach)
- Planned: FIM training for tab completion, ghost text completions via InlineCompletionItemProvider, generate_bbj_code MCP tool
The preceding chapters contain the technical detail behind each component: the BBj challenge, strategic architecture, fine-tuning, IDE integration, documentation chat, and RAG database design. Together, these seven chapters form the complete BBj AI strategy -- from problem statement through operational system and forward plan.