Skip to main content

The BBj Challenge

TL;DR

BBj is a programming language with a 40+ year lineage spanning four distinct generations -- from 1980s character terminals to modern browser applications. Generic LLMs like ChatGPT and GitHub Copilot have essentially zero BBj training data and will hallucinate incorrect code when asked for help. A custom fine-tuned model is the only viable path to AI-powered BBj development tools.

When a developer asks an AI assistant to "create a button in BBj," the response should depend on which of BBj's four generations they are targeting. Instead, every mainstream LLM produces the same result: fabricated code that looks vaguely like Visual Basic and does not compile. This is not a prompt engineering problem. It is a training data problem -- and it demands a fundamentally different approach to AI tooling.

BBj is not simply an obscure language. It is the living descendant of Business BASIC, a lineage that began on minicomputers in the late 1970s and has been continuously evolved by BASIS International through four major generations. Each generation introduced new paradigms -- terminal I/O, Windows-style GUI, Java Swing integration, browser rendering -- while preserving full backward compatibility. The result is a uniquely layered language where code from 1985 and code from 2025 coexist in the same runtime.

This chapter examines what makes BBj fundamentally different from languages that LLMs handle well, and why that difference demands a purpose-built AI strategy.

Four Generations of BBj

BBj's evolution is not the typical story of a language replacing its predecessor. Each generation added capabilities on top of the previous ones, and all four remain valid, compilable BBj code today. A single enterprise codebase may contain modules written across three decades.

Character UI (1980s -- present)

The original generation. Programs run in a terminal and use @(x,y) cursor positioning for all screen layout. Mnemonics control terminal behavior -- clearing the screen, setting colors, positioning the cursor. There are no windows, no buttons, no mouse events. Everything is PRINT and INPUT.

Character UI -- Terminal I/O
PRINT 'CS',
PRINT @(10,5),"Customer Name: ",
INPUT name$
PRINT @(10,7),"Hello, ",name$,"!"

Character-mode programs are still in active production at enterprises running BBj-based ERP and accounting systems. Developers maintaining these systems need AI that understands @(x,y) positioning, mnemonic codes, and the procedural GOSUB/RETURN flow that predates structured programming.

Visual PRO/5 (1990s -- 2000s)

The first graphical generation. Visual PRO/5 introduced Windows-style GUI elements through a SYSGUI channel and mnemonic strings: programs open a system GUI device, then use PRINT (sysgui)'WINDOW'(...) to create windows, PRINT (sysgui)'BUTTON'(...) to create controls, and CTRL(sysgui,id,index) to read control values. Programs could now display windows, buttons, menus, and dialogs.

Visual PRO/5 -- Mnemonic-Based GUI
sysgui = unt
open (sysgui)"X0"
print (sysgui)'window'(100,100,400,300,"Order Entry",$00010003$,$00000000$)
print (sysgui)'button'(201,10,250,100,30,"Save",$$)
status$ = ctrl(sysgui,201,0)

Visual PRO/5 is largely superseded by BBj GUI, but legacy code in this style still exists and occasionally needs maintenance. An AI tool that does not understand the PRINT (sysgui)'WINDOW'(...) / 'BUTTON'(...) mnemonic pattern will misinterpret this code entirely.

BBj GUI / Swing (2000s -- present)

A major architectural shift. BBj GUI introduced a Java-based object model accessed through the BBjAPI() entry point. Programs create objects, hold references with the ! suffix convention, and register callbacks by method name. Under the hood, the UI renders via Java Swing.

BBj GUI -- Object-Oriented API
sysgui! = BBjAPI().getSysGui()
window! = sysgui!.addWindow(100, 100, 800, 600, "Order Entry")
button! = window!.addButton(101, 10, 530, 100, 25, "Save")
button!.setCallback(button!.ON_BUTTON_PUSH, "handleSave")

This generation introduced BBj's class system (class/classend, method/methodend), use statements for Java interop, and field references with # (e.g., #window! inside a class). It is the dominant style for new BBj development.

DWC / Browser (2010s -- present)

DWC (Dynamic Web Client) uses the same BBjAPI as BBj GUI but renders the application in a web browser instead of a Swing window. From the developer's perspective, the code is nearly identical -- the key addition is BBjWebManager for browser-specific operations like executing JavaScript or responding to browser events.

DWC -- Browser Rendering with BBjWebManager
bui! = BBjAPI().getWebManager()
bui!.setCallback(bui!.ON_BROWSER_CLOSE, "handleClose")
index = bui!.executeAsyncScript("navigator.userAgent", 1)

DWC is the strategic direction for BBj applications. It enables deploying existing BBj GUI code to the browser with minimal changes, while also exposing browser-native capabilities. New applications increasingly target DWC from the start.

Reference: GUI Programming in BBj

The code examples in this chapter follow patterns documented in the Guide to GUI Programming in BBj (GuideToGuiProgrammingInBBj.pdf) by BASIS International, which covers the complete evolution of BBj GUI APIs from character mnemonics through the BBjAPI object model. The guide is available from BASIS International documentation.

Cross-Generation Code Comparison

The argument for custom AI tooling becomes visceral when you see the same task implemented across all four generations. Consider two fundamental operations: displaying a greeting and handling user interaction.

Task: Display a Greeting

Character UI (1980s)
PRINT 'CS',
PRINT @(10,3),"Welcome to Order Entry"
PRINT @(10,5),"Press any key to continue..."
INPUT x$

Task: Read a Customer Record

Character UI (1980s)
PRINT @(0,2),"Customer ID: ",
INPUT custId$
OPEN (1)"customers.dat"
READ (1, KEY=custId$, err=*next) name$, balance
PRINT @(0,4),"Name: ",name$
CLOSE (1)

Notice how the file I/O core (OPEN, READ, CLOSE) is identical across generations -- it is the UI layer and variable access patterns that diverge. An AI model must understand both the universal core and the generation-specific surface syntax to be useful.

Universal vs. Generation-Specific Syntax

BBj's shared core is substantial. String handling, numeric operations, control flow, file I/O, and error handling work identically whether you are writing a character-mode program or a DWC application. The generations diverge at the UI and interaction layer.

ScopeApplies ToExamples
UniversalAll generationsFOR/NEXT, IF/THEN/ELSE, GOSUB/RETURN, string functions ($ suffix), file I/O (OPEN, READ, WRITE), SETERR, PRINT (non-UI)
Modern BBjBBj GUI + DWCclass/classend, method/methodend, use, object references (! suffix), BBjAPI(), field references (# prefix)
Any GUIVisual PRO/5 + BBj GUI + DWCWindow and control concepts (syntax varies by generation)
DWC OnlyDWCBBjWebManager, executeAsyncScript, browser events
Legacy GUIVisual PRO/5PRINT (sysgui)'WINDOW'(...), PRINT (sysgui)'BUTTON'(...), CTRL(sysgui,id,index)
Character OnlyCharacter UI@(x,y) positioning, terminal mnemonics ('CS', 'SF', 'SB')

This distinction matters enormously for AI tooling. A model that understands only modern BBj syntax will fail on legacy character-mode code. A model trained only on universal patterns will not know how to create a window. The training data must be generation-tagged so the model learns which patterns apply where.

The Generic LLM Problem

Modern LLMs are trained on billions of lines of public code -- Python, JavaScript, Java, C++, and hundreds of other languages represented on GitHub and Stack Overflow. BBj is not among them. The language is proprietary to BASIS International, with no significant open-source corpus, no Stack Overflow tag with thousands of answers, and no representation in any known LLM training dataset.

The result is not merely ignorance -- it is confident hallucination:

User: "How do I create a button in BBj?"

Generic LLM response (hallucinated):
Dim button As New BBjButton
button.Caption = "Click Me"
button.OnClick = Sub()
MsgBox("Hello")
End Sub

Correct BBj:
button! = window!.addButton(101, 10, 10, 100, 25, "Click Me")
button!.setCallback(button!.ON_BUTTON_PUSH, "handleClick")

The LLM does not understand:

  • Object references ending with ! (e.g., window!, button!) -- a BBj convention with no equivalent in mainstream languages
  • String variables ending with $ (e.g., name$, custId$) -- inherited from BASIC but absent from modern LLM training data
  • Field references starting with # (e.g., #window!, #this!) -- class instance field access
  • The callback pattern using string method names ("handleClick") rather than function references or lambdas
  • Error handling via err=*next and SETERR rather than try/catch
  • Any actual BBjAPI method -- addWindow, addButton, getSysGui, getWebManager, and hundreds of others

RAG (Retrieval-Augmented Generation) can supplement a model's knowledge with documentation context, but it cannot fix fundamental language incomprehension. A model that does not understand BBj's type system, variable naming conventions, or object model will misinterpret even perfect documentation snippets injected into its context window.

Decision: Custom Fine-Tuned Model Required

Choice: Fine-tune a dedicated BBj language model rather than relying on prompt engineering or RAG-only approaches with generic LLMs.

Rationale: Base LLMs have essentially no BBj training data. Testing shows they hallucinate VB/VBA-style code when asked about BBj. RAG can supplement but cannot replace fundamental language comprehension. Fine-tuning teaches the model BBj's syntax, idioms, and APIs at the weight level.

Alternatives considered: Prompt engineering with few-shot examples (insufficient for code completion), RAG-only approach (fails on syntax-level understanding), waiting for future LLMs to learn BBj naturally (unlikely given BBj's proprietary nature and small community).

Status: Training data structure defined (9,922 ChatML examples in bbjllm); Qwen2.5-Coder-14B-Base recommended as next model. See Fine-Tuning.

Why GitHub Copilot Will Not Work

A common question: "Can we just use GitHub Copilot?" The short answer is no -- Copilot is a closed system that does not allow custom models.

AspectGitHub CopilotWhat BBj Needs
ModelOpenAI Codex/GPT (fixed)Fine-tuned BBj model
CustomizationNone for base completionsFull control over model weights
Training dataPublic GitHub reposCustom BBj training corpus
HostingGitHub cloud onlySelf-hosted option for data privacy

Copilot does offer extensibility features -- Copilot Extensions for chat plugins, Copilot for Business with policy controls, and the VS Code Language Model API for requesting completions. None of these change the underlying model. The base code completion engine has no meaningful BBj training data, and no amount of context injection through these extension points will teach it the language.

The fundamental mismatch: Copilot's model will suggest Dim instead of declare, invent method names that do not exist, confuse BBj with VB/VBA, and misinterpret the !, $, and # suffixes that are central to BBj's type system.

Building a custom VSCode extension with our own fine-tuned model is not competing with Copilot -- it is filling a gap that Copilot structurally cannot fill.

The webforJ Contrast

BASIS International also develops webforJ, a Java-based web framework. The AI story for webforJ is entirely different -- and the contrast illustrates exactly why BBj needs special treatment.

AspectwebforJBBj
Base languageJavaBBj (Business BASIC)
LLM training dataBillions of examplesNear zero
Generic LLM comprehensionExcellent -- understands Java nativelyNone -- hallucinates wrong syntax
What AI needsFramework-specific context via RAGFundamental language understanding via fine-tuning
Fine-tuning required?NoYes -- absolutely essential
Generic tools (Copilot, etc.)Work reasonably wellActively harmful -- teach wrong patterns

For webforJ, providing an MCP (Model Context Protocol) server that gives AI assistants access to framework documentation is sufficient. The base LLM already understands Java; we just supplement with webforJ-specific API knowledge.

For BBj, we must teach the model the language itself before any framework or API knowledge becomes useful. This is a fundamentally harder problem, but it is also what makes the investment worthwhile -- no one else will solve it, and the developers who depend on BBj have no alternative.

Current Status

Where Things Stand

The strategy described in this document is being actively implemented across multiple fronts.

  • Operational: The bbj-language-server (v0.5.0) brings Langium-powered IDE features to BBj developers via the VS Code Marketplace -- syntax highlighting, code completion, and diagnostics. See IDE Integration for the AI-powered extension roadmap.
  • Operational for internal exploration: A RAG-based knowledge system with 51K+ documentation chunks, a REST API, an MCP server (search_bbj_knowledge and validate_bbj_syntax tools), and a web chat interface powered by Claude API with SSE streaming and source citations.
  • Operational: Compiler validation via bbjcpltool integrates with both the MCP server and web chat, providing automatic BBj syntax checking.
  • Active research: Fine-tuning a BBj-specialized code model -- the bbjllm experiment has 9,922 training examples on Qwen2.5-Coder-32B-Instruct; research recommends moving to a 14B-Base model with a two-stage training approach.
  • Active research: Copilot BYOK integration remains in early exploration; Continue.dev is being evaluated as the primary IDE integration path.
  • Validated: Testing with ChatGPT, Claude, and GitHub Copilot confirms that all major LLMs produce incorrect BBj code. The hallucination pattern is consistent -- they default to VB/VBA-style syntax.
  • webforJ AI tools operational: The MCP server approach is working for webforJ, validating the broader infrastructure design. BBj is the remaining -- and harder -- challenge.

The chapters that follow detail each component of the strategy: the unified architecture, model fine-tuning, IDE integration, documentation chat, RAG database design, and the implementation roadmap.