Tools & Dependencies

This section details the development process and technical implementation of our VS Code extension for children's game development, including key challenges and solutions.


OpenVINO

OpenVINO (Open Visual Inference and Neural Network Optimization) is a toolkit developed by Intel that enables high-performance deployment of deep learning models across various hardware platforms, especially CPUs and integrated GPUs (Intel, 2020).

We used OpenVINO as a core part of our FastSD CPU setup to enable fast and efficient offline image generation. Since PixelPilot is designed to run on machines without specialised GPUs, OpenVINO allowed us to optimise Stable Diffusion models for inference on standard CPUs. By reducing the number of operations needed during inference and accelerating execution, OpenVINO drastically improved our generation speed while maintaining acceptable image quality. This was a key enabler for our goal of delivering a smooth, offline-first experience for children working in low-resource environments.
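To illustrate the kind of conversion OpenVINO enables, the sketch below exports a Stable Diffusion checkpoint to the OpenVINO format using the optimum-intel helpers. This is a minimal sketch, not the code FastSD CPU runs internally; the model ID and output directory are illustrative.

from optimum.intel import OVStableDiffusionPipeline

# Export a Stable Diffusion checkpoint to the OpenVINO IR format (illustrative model ID)
pipeline = OVStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    export=True,
)
pipeline.save_pretrained("stable-diffusion-v1-5-openvino")

# Subsequent runs can load the optimised model and generate images on a plain CPU
image = pipeline("a pixel-art spaceship on a white background").images[0]
image.save("spaceship.png")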


Flask

Flask is a lightweight and widely used Python web framework that simplifies the development of web servers and REST APIs (Pallets Projects, 2020).

In PixelPilot, we used Flask to serve local AI models, including Qwen for offline text generation. It acts as the backend that receives prompt requests from the VS Code extension and returns model outputs in real time. Flask allowed us to spin up a minimal, responsive API quickly and integrate it with our local setup. This dependency played a critical role in making our offline architecture accessible and easy to deploy.
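As a minimal sketch of this pattern (not the actual PixelPilot server, whose endpoints are shown later), a Flask app serving a local model behind a single POST route looks roughly like this; the placeholder response body stands in for real model output:

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/generate", methods=["POST"])
def generate():
    data = request.get_json()
    if not data or "prompt" not in data:
        return jsonify({"error": "Missing prompt in request"}), 400
    # A real implementation would run the local model here
    return jsonify({"response": "model output for: " + data["prompt"]})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5001)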

Transformers

The Transformers library by Hugging Face is a leading open-source framework for working with pre-trained transformer models across natural language processing (NLP), vision, and more (Wolf et al., 2020). It provides a unified API for loading and interacting with models such as GPT, BERT, and the Qwen family used in this project.

In PixelPilot, we used the Transformers library primarily for local AI model inference, including prompt-based chat completions and code generation. This was especially useful during offline development and testing. Transformers offered us flexibility in choosing from various architectures while abstracting away the complexities of tokenisation, attention mechanisms, and model decoding. It also played a vital role in bridging the gap between online models and offline fallback implementations.
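As a brief illustration of the abstraction the library provides (a generic sketch rather than PixelPilot's server code, with an illustrative model ID):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-Coder-1.5B-Instruct"  # illustrative model ID
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tokenise a prompt, generate a completion, and decode it back to text
inputs = tokenizer("Write a Python function that adds two numbers.", return_tensors="pt")
output_ids = model.generate(inputs["input_ids"], max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))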

Node.js

Node.js is a lightweight, event-driven JavaScript runtime built on Chrome's V8 engine. It is widely used for building scalable web applications and tooling (Node.js Foundation, 2020).

We used Node.js as the core runtime for the VS Code extension, powering most of our backend logic, including command registration, prompt construction, and API communication. It allowed us to use modern JavaScript (TypeScript) features while benefiting from rich package ecosystems like npm. Node.js also provided the foundation for modules such as node-fetch and Azure SDKs, which we used to connect with external AI APIs and local Flask servers.

Backend

For our backend, we run two separate servers: one on port 5001 serving one of the Qwen models, and the other on port 8000 running FastSD CPU for image generation.

Qwen Model Setup and Optimization

Function : setup()

In the server program, we created code to load the model and tokenizer needed for generating the game prompts/code. The function first opens a text file called “Size.txt”, which contains the size of the model that the user wishes to use. It then uses the Python platform library to check whether the user's device has an Intel processor. If it does, the function either downloads the model with the Hugging Face Transformers library and optimises it using OpenVINO, or, if the model has already been downloaded and optimised, simply loads it from disk. If the device does not have an Intel processor, the function downloads the model without optimising it. This allows the model to work on all devices, regardless of whether they have an Intel processor.

Here is the code:


# Imports used by setup() (shown here for completeness)
import os
import sys
import platform

from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.intel import OVModelForCausalLM


# determines which size/version of the model to download and use
def setup():
    # Read the desired model size from Size.txt
    file_path = 'Size.txt'
    with open(file_path, 'r') as file:
        size = file.read().strip()

    base_dir = os.path.dirname(os.path.abspath(__file__))
    model_path = os.path.join(base_dir, "Qwen2.5-Coder-" + size + "-Instruct-ov-fp16")
    model_name = "Qwen/Qwen2.5-Coder-" + size + "-Instruct"

    # Checks if the device has an Intel processor. If so, it optimises the model with
    # OpenVINO. Otherwise it just downloads the model.
    if "intel" in platform.processor().lower() and sys.platform.startswith(("win", "linux")):

        # Checks if the model has already been downloaded and exported,
        # otherwise downloads it and converts it to the OpenVINO format
        if not os.path.isdir(model_path):
            model = OVModelForCausalLM.from_pretrained(
                model_name,
                export=True
            )
            model.save_pretrained(model_path)

        model = OVModelForCausalLM.from_pretrained(model_path)
        tokenizer = AutoTokenizer.from_pretrained(model_name)

    else:
        model = AutoModelForCausalLM.from_pretrained(model_name)
        tokenizer = AutoTokenizer.from_pretrained(model_name)
    return model, tokenizer


model, tokenizer = setup()

                

Web Servers Launch

Runner.py

This file is responsible for running both web servers on a device. When deploying this project, the client has to package this Python script into an executable for the users, which can be done with PyInstaller. Users are advised to always keep the provided model folders in the same directory as the executable for it to work. The script starts by checking whether the path to the offlineModels/lcm-lora-sdv1-5 model is correctly written in lcm-lora-models.txt, located in the fastsdcpu/configs folder. If it is not, this means either that the executable is being run for the first time or that the directory containing the executable and model folders has changed. In that case, the script writes the path to the offlineModels/lcm-lora-sdv1-5 model into fastsdcpu/configs/lcm-lora-models.txt and the path to the offlineModels/dreamshaper-8 model into fastsdcpu/configs/stable-diffusion-models.txt. This configures the FastSD CPU web server to work with the offline models, as sketched below.
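A minimal sketch of this configuration step is shown below. The helper name and the assumption that each config file simply holds one model path are ours for illustration, not taken verbatim from Runner.py:

import os

base_dir = os.path.dirname(os.path.abspath(__file__))

# Absolute paths to the bundled offline model folders (as described above)
lcm_lora_path = os.path.join(base_dir, "offlineModels", "lcm-lora-sdv1-5")
dreamshaper_path = os.path.join(base_dir, "offlineModels", "dreamshaper-8")

def ensure_model_path(config_file: str, model_path: str) -> None:
    """Write model_path into config_file if it is not already recorded there."""
    with open(config_file, "r") as f:
        contents = f.read()
    if model_path not in contents:
        with open(config_file, "w") as f:
            f.write(model_path + "\n")

ensure_model_path(os.path.join(base_dir, "fastsdcpu", "configs", "lcm-lora-models.txt"), lcm_lora_path)
ensure_model_path(os.path.join(base_dir, "fastsdcpu", "configs", "stable-diffusion-models.txt"), dreamshaper_path)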

Finally, it adds the commands required to run each web server to a list, using the subprocess library, and runs the processes together. It uses sys.executable to access the Python interpreter. This is wrapped in a try/except block so that, if an error occurs, the user is informed and may need to reinstall the files. In the except block, a caught KeyboardInterrupt means the user manually interrupted the running servers, so the subprocesses that were running the web servers are terminated.

Here is the relevant code:


try:
    processes = []

    # Run the FastSD-CPU API server
    processes.append(subprocess.Popen(
        [sys.executable, "src/app.py", "--api"],
        cwd=os.path.join(base_dir, "fastsdcpu"),
    ))

    # Run the Qwen web server
    processes.append(subprocess.Popen(
        [sys.executable, "-m", "uvicorn", "server:app", "--host", "0.0.0.0", "--port", "5001", "--reload"],
        cwd=os.path.join(base_dir),
    ))

    # Wait on both servers so the launcher keeps running
    for process in processes:
        process.wait()

except KeyboardInterrupt:
    print("\nProcess interrupted by user.")
    for process in processes:
        process.terminate()
except subprocess.CalledProcessError:
    print("Error running FastSD-CPU server or Qwen server.")
                

Qwen Model Endpoints

Generating Game Sub-Prompts

Function : generate_code()
Endpoint : http://localhost:5001/generate

This method of the web server handles prompts sent to localhost:5001/generate. It processes the prompt and generates a series of sub-prompts which can be used to make the game. To do this, the method first sends the following system prompt to the model:

“You are an AI assistant that analyzes game ideas and breaks them down into structured components. Given a game concept, identify all necessary features such as player controls, game rules, UI elements, scoring, enemy behavior, and levels. Then, generate a list of specific prompts, each describing an individual component in detail, so another AI can use them to generate code. Ensure the prompts are clear, precise, and formatted in a way that guides code generation while keeping all components modular and integratable.”

This system prompt explains to the model how it is supposed to respond to the user's input. Every time a prompt is received from the user, this message is sent first, which ensures that it is always within the model's context window.

Here is the code:


@app.route("/generate", methods=["POST"])
def generate_code():
    data = request.get_json()
    if not data or "prompt" not in data:
        return jsonify({"error": "Missing prompt in request"}), 400
    
    prompt = data["prompt"]
    messages = [
        {"role": "system", "content": """You are an AI assistant that analyzes game ideas and breaks them down into structured components.
            Given a game concept, identify all necessary features such as player controls, game rules, UI elements, scoring, enemy behavior, and levels.
            Then, generate a list of specific prompts, each describing an individual component in detail, so another AI can use them to generate code.
            Ensure the prompts are clear, precise, and formatted in a way that guides code generation while keeping all components modular and integratable."""},
        {"role": "user", "content": prompt}
    ]
    
    input_text = "\n".join([message["content"] for message in messages])
    inputs = tokenizer(input_text, return_tensors="pt", padding=True, truncation=True).to(model.device)
    
    generated_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs.get("attention_mask"),
        max_new_tokens=512,
        temperature=0.01
    )
    
    response = tokenizer.decode(generated_ids[0],skip_special_tokens=True)
    return response  
                
Function : generate_code1()
Endpoint : http://localhost:5001/generate1

This is another method of the web server, handling prompts sent to localhost:5001/generate1 via the Python Flask library. It takes in the prompt and generates a section of code using one of the Qwen2.5-Coder-Instruct models downloaded with the Hugging Face Transformers library and optimised with OpenVINO (if the user has an Intel processor). This method can be used in tandem with other methods to create the game. To do this, the method first sends the following system prompt to the model:

“You are an AI assistant that generates game code based on structured prompts. Given a detailed prompt describing a game component, implement modular, well-documented code that follows best practices for readability, reusability, and maintainability. Ensure that each component integrates seamlessly with others to form a complete game. If the prompt specifies dependencies or interactions with other components, design the code accordingly to support smooth integration.”

This system prompt explains to the model how it is supposed to respond to the user's input. Every time a prompt is received from the user, this message is sent first, which ensures that it is always within the model's context window.

The two endpoints /generate and /generate1 are designed to work in tandem to create the game. The model is unable to create an entire game from a single prompt, which makes it susceptible to outputting unfinished code. Breaking the task into multiple prompts allows the model to generate the code piece by piece, in a more modular fashion. Furthermore, this allows the user to store different sections of code in different files if they wish. A sketch of how a client could chain the two endpoints follows the code below.

Here is the code for generate_code1():


@app.route("/generate1", methods=["POST"])
def generate_code1():
    data = request.get_json()
    if not data or "prompt" not in data:
        return jsonify({"error": "Missing prompt in request"}), 400
    
    prompt = data["prompt"]
    messages = [
        {"role": "system", "content": """You are an AI assistant that generates game code based on structured prompts.
            Given a detailed prompt describing a game component, implement modular, well-documented code that follows best practices for readability, reusability, and maintainability.
            Ensure that each component integrates seamlessly with others to form a complete game.
            If the prompt specifies dependencies or interactions with other components, design the code accordingly to support smooth integration."""},
        {"role": "user", "content": prompt}
    ]
    
    input_text = "\n".join([message["content"] for message in messages])
    inputs = tokenizer(input_text, return_tensors="pt", padding=True, truncation=True).to(model.device)
    
    generated_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs.get("attention_mask"),
        max_new_tokens=1024,
        temperature=0.01
    )
    
    response = tokenizer.decode(generated_ids[0],skip_special_tokens=True)
    return response
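As an illustration, a client could chain the two endpoints as follows. This is a minimal sketch using the requests library; the way the sub-prompts are split out of the first response is an assumption made for illustration, since the real extension parses the model output on the VS Code side.

import requests

# Hypothetical example of chaining /generate and /generate1
game_idea = "A platformer where a cat collects fish while avoiding dogs"

# 1. Ask /generate to break the game idea into sub-prompts
sub_prompt_text = requests.post(
    "http://localhost:5001/generate",
    json={"prompt": game_idea},
).text

# 2. Naively treat each non-empty line as one sub-prompt (assumption:
#    the real extension parses the structured output more carefully)
sub_prompts = [line for line in sub_prompt_text.splitlines() if line.strip()]

# 3. Ask /generate1 to produce code for each component
for sub_prompt in sub_prompts:
    code = requests.post(
        "http://localhost:5001/generate1",
        json={"prompt": sub_prompt},
    ).text
    print(code)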
                  
                

Paths API Endpoint

Function : getPaths()
Endpoint : http://localhost:5001/getPaths

This method has the sole purpose of sending the paths of the local offline models (the offlineModels folder) to the extension, so that the extension can use them when sending a POST request to the FastSD CPU server.


# Serves the local model paths at /getPaths
@app.route("/getPaths")
def getPaths():
    response = {
        "lcm_path": lcm_path,
        "lora_path": lora_path
    }
    return jsonify(response)
                

FastSD CPU Endpoint

Image Generation

Endpoint : http://localhost:8000/api/generate

This API endpoint is provided by the FastSD CPU server to generate an image. Sending a POST request with a JSON body like the one shown below returns the generated image as a base64 string. Example JSON:


{
    "use_offline_model": true,
    "prompt": "a cute cat",
    "width": 512,
    "height": 512,
    "num_images": 1,
    "inference_steps": 4,
    "seed": 123123,
    "use_seed": false,
    "diffusion_task": "text_to_image",
    "use_lcm_lora": true,
    "negative_prompt": "coloured background"
}
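For reference, the request can be exercised directly from Python as in the sketch below. This mirrors the documented payload; the output filename is illustrative, and the response is assumed to contain an "images" list of base64 strings, as the extension code later in this section also expects.

import base64
import requests

# Hypothetical illustration of calling the FastSD CPU endpoint from Python
payload = {
    "use_offline_model": True,
    "prompt": "a cute cat",
    "width": 512,
    "height": 512,
    "num_images": 1,
    "inference_steps": 4,
    "seed": 123123,
    "use_seed": False,
    "diffusion_task": "text_to_image",
    "use_lcm_lora": True,
    "negative_prompt": "coloured background",
}

response = requests.post("http://localhost:8000/api/generate", json=payload)
data = response.json()

# The server returns the generated images as base64 strings
image_bytes = base64.b64decode(data["images"][0])
with open("cat.png", "wb") as f:
    f.write(image_bytes)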
             
                

Extension

This section details the implementation of the PixelPilot VS Code extension, including its architecture, key components, and how it integrates with the backend.

Extension Entry Point (extension.ts)

The file extension.ts serves as the primary entry point for the PixelPilot Visual Studio Code extension. This file closely adheres to best practices recommended for VS Code extension development, ensuring the extension remains responsive and integrates seamlessly within the VS Code ecosystem.

Initialisation

Upon activation of the PixelPilot extension, extension.ts sets up several critical functionalities:

  • Chat Participant Creation: A custom chat participant named pixelpilot is created using VS Code's Chat API.
  • Output Channel: An output channel (PixelPilot) is established to log important information, errors, and debugging messages, providing visibility into the extension's operation for both users and developers.

Mastery Level and Model Configuration

The extension initialises global state settings for mastery levels and AI model selections:

  • Mastery Levels: Three mastery levels (NOVICE, INTERMEDIATE, PRO) are predefined with clear descriptions, allowing PixelPilot to tailor responses based on the user's expertise. The default mastery level is set to NOVICE.
  • Model Selection: Default AI models for chat completion and code generation are configured (gpt-4o). Users can customise these settings later through commands provided by the extension.

Periodic Tasks
  • Image State Management: The extension executes updateImageState() every 60 seconds to manage and refresh the state of generated images or assets, ensuring the visual content remains current for the user's game development workflow.

Commands and Interactivity

Several commands are registered to enhance user interactivity and customisation:

  • PixelPilot Walkthrough (pixelpilot.start): This command initiates an interactive walkthrough via a VS Code Webview Panel, guiding users through the extension’s features step-by-step. The walkthrough serves as a game template generator, assisting users in refining their ideas into a clear, actionable prompt for code generation. It interactively queries the user to determine key parameters, including:
    • Game Type
    • Target Platform
    • Target Controller
    • Target Framework
    • Game Description
  • Model Selection Commands:
    • pixelpilot.selectChatCompletionModel: Allows users to select their preferred AI model for chat completion tasks from our list of supported models (gpt-4o, Phi-4, Llama-3.3-70B-Instruct, DeepSeek-V3, DeepSeek-R1).
    • pixelpilot.selectCodeGenerationModel: Allows users to select their preferred AI model for code generation tasks (gpt-4o, Phi-4, Llama-3.3-70B-Instruct, DeepSeek-V3, DeepSeek-R1, Codestral-2501, Qwen-2.5-Coder).
  • Mastery Level Selection (pixelpilot.selectMasteryLevel): Allows users to select their preferred mastery level, allowing PixelPilot to tailor responses based on the user's expertise.

Lifecycle Management
  • Activation: Upon activation, the extension performs all initialisation tasks described above, ensuring smooth integration within VS Code's workflow.
  • Deactivation: Currently, the deactivate function is minimal, as explicit resource cleanup is not necessary, but it can be expanded in future updates as required.

This structured approach ensures clarity, maintainability, and ease of future development, aligning with standard VS Code extension development practices and fulfilling the project's MoSCoW requirements.

    
    import * as vscode from 'vscode';
    import * as path from 'path';
    import { updateImageState } from './util/imageGen';
    
    
    import { Extension } from './util/Extension';
    
    
    import { Walkthrough } from './views/walkthrough';
    
    
    export const CHAT_COMPLETION_MODELS = ['gpt-4o', 'Phi-4', 'Llama-3.3-70B-Instruct', 'DeepSeek-V3', 'DeepSeek-R1'];
    export const CODE_GENERATION_MODELS = ['gpt-4o', 'Phi-4', 'Llama-3.3-70B-Instruct', 'DeepSeek-V3', 'DeepSeek-R1', 'Codestral-2501', 'Qwen-2.5-Coder'];
    
    
    // Extension init.
    import handler from './handler';
    
    
    const PARTICIPANT = vscode.chat.createChatParticipant('pixelpilot', handler);
    
    
    export const outputChannel = vscode.window.createOutputChannel('PixelPilot');
    outputChannel.appendLine('PixelPilot output channel created.');
    
    
    export function activate(context: vscode.ExtensionContext) {
        outputChannel.show();
        
        updateImageState().catch(console.error);
    
    
            // Set up a periodic check to update the image state
        setInterval(() => {
            updateImageState().catch(console.error);
        }, 60000); // Check every 60 seconds
    
    
        // Set the default mastery levels.
        context.globalState.update('masteryLevel', 'NOVICE'); // Default mastery level.
        context.globalState.update('NOVICE', "A novice is someone who is just starting to learn game development. They are new to programming and game design concepts. They might have little to no experience with coding and are learning the basics of how to create simple games. At this level, they are learning how to use game development tools, understand basic programming concepts like variables and loops, and create simple game mechanics. Specific skills: - Understanding basic programming concepts (variables, loops, conditionals) - Using a game development environment (e.g., Scratch, MakeCode, PyGame) - Creating simple 2D games with basic mechanics (e.g., moving a character, collecting items) - Learning how to debug simple errors");
        context.globalState.update('INTERMEDIATE', "An intermediate learner has a good grasp of basic programming and game design concepts. They can create more complex games and understand how to use more advanced features of game development tools. They are comfortable with coding and can implement more sophisticated game mechanics. At this level, they are learning about game physics, animations, and user interfaces. Specific Skills: - Writing more complex code (functions, arrays, objects) - Using intermediate features of game development tools (e.g., Unity, Godot) - Creating 2D games with more advanced mechanics (e.g., physics-based interactions, animations) - Designing user interfaces and menus - Debugging and optimising game performance");
        context.globalState.update('PRO', "A pro learner is highly skilled in game development and has a deep understanding of programming and game design. They can create complex and polished games, using advanced features of game development tools. They are capable of working on larger projects and collaborating with others. At this level, they are learning about advanced game design principles, 3D game development, and integrating external libraries and assets. Specific skills: - Writing advanced code (inheritance, polymorphism, design patterns) - Using advanced features of game development tools (e.g., Unreal Engine, advanced Unity features) - Creating 3D games with complex mechanics and high-quality graphics - Implementing multiplayer functionality and networked games - Collaborating with others on game projects (version control, project management) - Integrating external libraries and assets (e.g., physics engines, AI libraries)");
    
    
        // Set the default models.
        context.globalState.update('chatCompletionModel', 'gpt-4o');
        context.globalState.update('codeGenerationModel', 'gpt-4o');
    
    
        // Update context.
        Extension.context = context;
    
    
        const startCommand = vscode.commands.registerCommand('pixelpilot.start', async () => {
    
    
        const panel = vscode.window.createWebviewPanel(
            'walkthrough',
            'PixelPilot Walkthrough',
            vscode.ViewColumn.One,
            { enableScripts: true }
        );
    
    
        const walkthrough = new Walkthrough(panel);
        walkthrough.startWalkthrough();
        });
    
    
        const selectChatCompletionModelCommand = vscode.commands.registerCommand('pixelpilot.selectChatCompletionModel', async () => {
        const selectedModel = await vscode.window.showQuickPick(CHAT_COMPLETION_MODELS, {
            placeHolder: 'Select a Chat Completion model',
        });
    
    
        if (selectedModel) {
            context.globalState.update('chatCompletionModel', selectedModel);
            vscode.window.showInformationMessage(`Selected Chat Completion model: ${selectedModel}`);
    
    
            // Update context in the custom Extension class.
            Extension.context = context;
        }
        });
    
    
        const selectCodeGenerationModelCommand = vscode.commands.registerCommand('pixelpilot.selectCodeGenerationModel', async () => {
        const selectedModel = await vscode.window.showQuickPick(CODE_GENERATION_MODELS, {
            placeHolder: 'Select a Code Generation model',
        });
    
    
        if (selectedModel) {
            context.globalState.update('codeGenerationModel', selectedModel);
            vscode.window.showInformationMessage(`Selected Code Generation model: ${selectedModel}`);
    
    
            // Update context in the custom Extension class.
            Extension.context = context;
        }
        });
    
    
        const selectMasteryLevel = vscode.commands.registerCommand('pixelpilot.selectMasteryLevel', async () => {
        const selectedMasteryLevel = await vscode.window.showQuickPick(["NOVICE", "INTERMEDIATE", "PRO"], {
            placeHolder: 'Select your mastery level in order for PixelPilot to better tailor its responses to your needs.',
        });
    
    
        if (selectedMasteryLevel) {
            context.globalState.update('masteryLevel', selectedMasteryLevel);
            vscode.window.showInformationMessage(`Selected Mastery Level: ${selectedMasteryLevel}`);
    
    
            // Update context in the custom Extension class.
            Extension.context = context;
        }
        });
    
    
        context.subscriptions.push(startCommand, selectChatCompletionModelCommand, selectCodeGenerationModelCommand, selectMasteryLevel);
    }
    
    
    export function deactivate() {}
                 
                    

Chat Request Workflow

The chat request workflow within the PixelPilot extension provides a structured and efficient approach to manage user interactions with the AI assistant, ensuring clarity and accuracy in responses. The workflow consists of the following key stages:

Intent Detection

When a user submits a chat request, the system initially classifies the intent behind the user's message. This step determines the type of assistance needed, categorising requests into the following types:

  • GENERAL CONVERSATION – for casual greetings, small talk, or non-task-related messages that do not require further clarification.
  • GENERAL ASSISTANCE – for quick questions, troubleshooting, or debugging help that does not require code refactoring.
  • EXPLAIN CONCEPT – for explaining game development or programming topics.
  • IMAGE GENERATION – for drawing, creating assets, or visual elements.
  • CODE GENERATION – for writing new scripts or game logic.
  • REFACTOR CODE – for improving or debugging existing code.
  • FURTHER CLARIFICATION NEEDED – if the user wants to perform a task, but their request is too vague or ambiguous to be categorised as EXPLAIN CONCEPT, IMAGE GENERATION, CODE GENERATION, or REFACTOR CODE.

Context Checking

After the intent has been identified, PixelPilot verifies whether the user has provided sufficient details to proceed with the request. This ensures essential information is present for effective task execution. The system checks for required details based on the request type, as follows:

  • GENERAL ASSISTANCE: Does the request include a specific question or issue? If debugging, does the user provide an error message or description of the issue? Does the user provide the expected outcome?
  • EXPLAIN CONCEPT: Does the request specify a concept name? Does it mention the user's prior knowledge level? Does it describe what the user wants to learn about the concept?
  • IMAGE GENERATION: Does the request specify a theme? Does it describe the key elements of the image? Does it specify an art style? Does the request specify multiple images?
  • CODE GENERATION: Does the request specify a programming language? Does it mention a framework or library (if applicable)? Does it describe the desired functionality of the script?
  • REFACTOR CODE: Does the request include a code snippet? Does it describe the specific problem the user wants to fix or improve?
  • FURTHER CLARIFICATION NEEDED: Does the request indicate that the user wants to perform a task but lacks enough details to be categorised as EXPLAIN CONCEPT, IMAGE GENERATION, CODE GENERATION, or REFACTOR CODE?

Clarifying Questions

If the system identifies that the provided context is insufficient, PixelPilot proactively formulates and presents clarifying questions directly within the chat interface. These questions prompt the user to supply missing or ambiguous details necessary for fulfilling the request.

Task Execution

Upon confirming that sufficient context has been provided, the assistant executes the request using the appropriate AI model, based on the user's previously selected configurations (e.g., GPT-4o, Phi-4, Llama, etc.). Tasks may include generating code snippets, creating visual assets, or providing detailed explanations.

Response Delivery

The result of the task is clearly presented back to the user within the chat interface. The responses are tailored according to the user's selected mastery level - ranging from novice-friendly guidance to advanced technical explanations - ensuring the user receives appropriate, comprehensible, and actionable feedback.

This structured workflow not only streamlines interactions with the AI assistant but also significantly enhances user experience by providing precise, contextually accurate, and tailored assistance for each chat request.

Image Generation (imageGen.ts)

This file contains the logic for the image generation features our extension provides. The main functions present in this file are:

Function : initImages(progress, prompt, single)

This function takes a prompt, formats the base prompts from prompts.json, and sends them to the AI model, which returns a JSON containing a better-described prompt optimised for FastSD CPU. 'single' is a boolean: if it is true, only one description needs to be returned for the provided prompt, which is the case when the user types '@pixel /image {prompt}'. If it is false, the AI model has to return all the individual image assets that could be used to make the game described in the prompt, which is the case when the user is using the walkthrough.


async function initImages(progress: vscode.Progress<{ message?: string; increment?: number }>, prompt: string, single: boolean) {
     
    progress.report({ increment: 5, message: 'Creating prompt...' });
    
    const tokenSource = new vscode.CancellationTokenSource();
    const token = tokenSource.token;
    let craftedPrompt: vscode.LanguageModelChatMessage;
    if (single) {
        craftedPrompt = vscode.LanguageModelChatMessage.User(formatPrompt(prompts.initImageDescription, prompt));
    }
    else{
        craftedPrompt = vscode.LanguageModelChatMessage.User(formatPrompt(prompts.initImageDescriptions, prompt));
    }
    progress.report({ increment: 10, message: 'Prompting AI for image generation...' });
    
    const responseStream = await askModel([craftedPrompt], token);
    
    let response = '';
    for await (const chunk of responseStream.text) { response += chunk; }
    outputChannel.appendLine(response);
    const data = JSON.parse(response);
    
    return data;

}
            
Function : generateImage(prompt)

This function handles sending the POST request to the FastSD CPU server, parses the response as JSON, and converts the base64 string into a Buffer containing the binary data of the image. Below is the code snippet for the POST request we send and the conversion of the base64 string:


...
const response = await fetch('http://localhost:8000/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
        use_offline_model: true,
        prompt: prompt,
        width: 512,
        height: 512,
        num_images: 1,
        inference_steps: 4,
        seed: 123123,
        use_seed: false,
        diffusion_task: "text_to_image",
        use_lcm_lora: true,
        negative_prompt : "coloured background",
        lcm_lora : {
            base_model_id : lcm_path,
            lcm_lora_id : lora_path
        }
    })
});

const data = await response.json() as { images: ImageResponse[], latency: number };

if (!data.images || data.images.length === 0) {
    throw new Error("No image data received.");
}

const imageData = data.images[0];

// Ensure imageData is a string before converting to Buffer
if (typeof imageData !== 'string') {
    throw new Error("Image data is not a string.");
}

const imageBuffer = Buffer.from(imageData, 'base64');
...
        
Function : saveImage(image, fileDestination, imageDescription)

This function takes an image's binary data as a Buffer and writes the file to the provided location. It also adds the fileDestination and imageDescription to the global configuration variables, then updates and saves their state.


export async function saveImage(image: Buffer, fileDestination: string, imageDescription: string) {
    const parentPath = vscode.workspace.workspaceFolders?.[0]?.uri.path || '';
    const assetsPath = path.join(parentPath, 'assets');
    const imagesPath = path.join(assetsPath, 'images');
    
    if (!fs.existsSync(assetsPath)) {
        fs.mkdirSync(assetsPath);
    }
    if (!fs.existsSync(imagesPath)) {
        fs.mkdirSync(imagesPath);
    }
    
    const fileDest = path.join(parentPath, fileDestination);
    
    fs.writeFileSync(fileDest, image);
    
    imageDescriptions.push(imageDescription);
    imageDests.push(fileDestination);
    
    saveState();
    await updateImageState();

}         
    
Function : replaceImage(progress, prompt)

When the user types '@pixel /image {prompt}', this is the first function triggered in the handler.ts file. It formats the 'replaceDecision' base prompt from prompts.json with the user's prompt and the current imageDescriptions[] list, asking the AI model whether the prompt is suggesting replacing an existing image or only generating a new one. The model replies with a JSON in which response.replace is a string of either 'True' or 'False'. If the prompt suggests a replacement, response.oldImageDesc is set to the description from imageDescriptions[] that is going to be replaced; otherwise it is set to an empty string. response.newImageDesc is set to the description of the new image regardless of whether a replacement is suggested, and this new description is optimised for FastSD CPU. The function then calls generateThreeImages and returns a list containing a list of three image dictionaries at index 0, the replace string at index 1, and the description of the image to be replaced at index 2.


export async function replaceImage(progress: vscode.Progress<{ message?: string; increment?: number }>, prompt: string) {
    console.log('Generating images...');
    console.log(prompt);
    
    
    progress.report({ increment: 5, message: 'Creating prompt...' });
    
    
    const tokenSource = new vscode.CancellationTokenSource();
    const token = tokenSource.token;
    
    
    const craftedPrompt = vscode.LanguageModelChatMessage.User(replaceFormatPrompt(prompts.replaceDecision, prompt));
    
    
    progress.report({ increment: 10, message: 'Prompting AI for image generation...' });
    
    
    const responseStream = await askModel([craftedPrompt], token);
    
    
    let response = '';
    for await (const chunk of responseStream.text) { response += chunk; }
    const data = JSON.parse(response);
    
    
    console.log('replaceImage data:', data);
    
    
    console.log('imageDescriptions:', imageDescriptions);
    console.log('imageDests:', imageDests);
    
    const images: Dictionary[] = await generateThreeImages(progress, data.newImageDesc);
    
    
    console.log('Generated images:', images);
        
    return [images, data.replace, data.oldImageDesc];
    
    
    }         
    
Function : generateThreeImages(progress, prompt)

This function first calls initImages with the 'single' boolean set to true, passing the prompt as an argument. Once it receives the response, which is essentially the image description optimised for FastSD CPU, it loops three times, calling generateImage on each iteration and adding the result to a dictionary that also stores the file destination and image description; each dictionary is appended to a list. At the end, the function returns a list of dictionaries, each containing an image's binary data, its file destination, and its description.


export async function generateThreeImages(progress: vscode.Progress<{ message?: string; increment?: number }>, prompt: string) {
    const response = await initImages(progress, prompt, true);

    const images: Dictionary[] = [];
    for (let i = 0; i < 3; i++) {
        progress.report({ increment: 10, message: `Generating image ${i + 1}...` });

        const image = await generateImage(response.imageDescription);
        const imageDict: Dictionary = {
            "image": image,
            "fileDestination": response.fileDestination,
            "imageDescription": response.imageDescription
        };

        images.push(imageDict);
    }
    return images;
}
    
Functions : changeCodeImage(oldImageDesc) and replaceImageReferences(oldImage, newImage)

After the user selects one of the three images, the changeCodeImage function is called in handler.ts if the original prompt entered into the chat participant suggested replacing an image. This function deletes the image to be replaced, using the old image description provided, and then calls replaceImageReferences. The replaceImageReferences function goes through every file in the src folder and changes the file destination of the old image to the file destination of the new image.


export async function changeCodeImage(oldImageDesc: string) {
    const index = imageDescriptions.indexOf(oldImageDesc);
    if (index !== -1) {

        // Delete the old image file
        if (fs.existsSync(imageDests[index])) {
            fs.unlinkSync(imageDests[index]);
        }

        // Replace image references in code
        await replaceImageReferences(imageDests[index], imageDests[imageDests.length - 1]);

        // Remove the old image description and destination
        imageDescriptions.splice(index, 1);
        imageDests.splice(index, 1);

        // Save the updated state
        saveState();
        await updateImageState();
    } else {
        console.log('Old image description not found.');
    }
}

async function replaceImageReferences(oldImage: string, newImage: string) {
    const files = await vscode.workspace.findFiles('src/**', '**/node_modules/**');
    for (const file of files) {
        const filePath = file.fsPath;
        const fileContent = await fs.promises.readFile(filePath, 'utf8');

        if (fileContent.includes(oldImage)) {
            console.log('Old image found in:', filePath);
            const updatedContent = fileContent.replace(new RegExp(oldImage, 'g'), newImage);
            await fs.promises.writeFile(filePath, updatedContent, 'utf8');
            outputChannel.appendLine(`Updated references in ${filePath}`);
        }
    }
}
    

Find YouTube Videos (findVideos.ts)

Function : getCoreFundamentals(progress, editor)

This function is also called in handler.ts. It takes an active editor as an argument and uses it to obtain the selected code. The selected code, as a string, is used to format the initCodeFundamentals base prompt from prompts.json. This new prompt is sent to an AI model to obtain the core programming concepts used in the provided code; response.fundamentals is a list of these core concepts. The list is then used to format the explainConceptsToChildren base prompt from prompts.json, and this second prompt is sent to the AI model to obtain basic explanations of the concepts, three YouTube videos on each concept, and a summary of each video.


export async function getCoreFundamentals(progress: vscode.Progress<{ message?: string; increment?: number }>, editor?: vscode.TextEditor) {


    if (editor) {
        const selection = editor.selection;
        const selectedText = editor.document.getText(selection);
    
    
        if (selectedText) {
    
    
            const tokenSource = new vscode.CancellationTokenSource();
            const token = tokenSource.token;
            const craftedPrompt = vscode.LanguageModelChatMessage.User(prompts.initCodeFundamentals.replace('{}', selectedText));
            const responseStream = await askModel([craftedPrompt], token);
            
            let response = '';
            for await (const chunk of responseStream.text) { response += chunk; }
            outputChannel.appendLine(response);
            const data = JSON.parse(response);
    
    
            const newCraftedPrompt = vscode.LanguageModelChatMessage.User(prompts.explainConceptsToChildren.replace('{}', data.fundamentals).toString());
            const newResponseStream = await askModel([newCraftedPrompt], token);
            
            let newResponse = '';
            for await (const chunk of newResponseStream.text) { newResponse += chunk; }
            if (newResponse.startsWith('```json') && newResponse.endsWith('```')) {
                newResponse = newResponse.slice(7, -3).trim();
            }
            outputChannel.appendLine(newResponse);
            const newData = JSON.parse(newResponse);
            
            return newData.concepts;
        }
    }
    return [];
}
    
