1. Cursor Chat: Plan

I want to build a single script file, a JavaScript file that uses my OpenAI key to total up my daily protein from everything I ate in the day. Assume that everything I ate will be provided in one message, so there's no need to store long-running chats; each run will be completely independent. There's no project structure. Just use a single file. I will run it from my command line.

Let's just use fetch requests, so no OpenAI packages are necessary. Dotenv is good. Let's enter the prompt directly in the file, since we're just running the file from the command line.

First, I want you to plan out the feature. Please don't code anything yet.

2. Cursor Agent: Use Docs

Please use the documentation below. I want to use the GPT-4o mini model.

Text generation and prompting
=============================

Learn how to prompt a model to generate text.

With the OpenAI API, you can use a [large language model](/docs/models) to generate text from a prompt, as you might using [ChatGPT](https://chatgpt.com). Models can generate almost any kind of text response—like code, mathematical equations, structured JSON data, or human-like prose.

Here's a simple example using the [Chat Completions API](/docs/api-reference/chat).

Generate text from a simple prompt

```javascript
import OpenAI from "openai";

const client = new OpenAI();

const completion = await client.chat.completions.create({
  model: "gpt-4.1",
  messages: [
    {
      role: "user",
      content: "Write a one-sentence bedtime story about a unicorn.",
    },
  ],
});

console.log(completion.choices[0].message.content);
```

```python
from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {
            "role": "user",
            "content": "Write a one-sentence bedtime story about a unicorn."
        }
    ]
)

print(completion.choices[0].message.content)
```

```bash
curl "https://api.openai.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {
        "role": "user",
        "content": "Write a one-sentence bedtime story about a unicorn."
      }
    ]
  }'
```

An array of content generated by the model is in the choices property of the response. In this simple example, we have just one output which looks like this:


```json
[
  {
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Under the soft glow of the moon, Luna the unicorn danced through fields of twinkling stardust, leaving trails of dreams for every child asleep.",
      "refusal": null
    },
    "logprobs": null,
    "finish_reason": "stop"
  }
]
```

In addition to plain text, you can also have the model return structured data in JSON format - this feature is called Structured Outputs.
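As a sketch of what that can look like with a raw fetch call (matching this lesson's no-SDK approach): the `response_format` field with `json_schema` is the Structured Outputs mechanism, while the `protein_estimate` schema and prompt below are hypothetical examples chosen to fit this lesson's protein tracker.

```javascript
// Sketch: request JSON conforming to a schema via Structured Outputs.
// Assumes OPENAI_API_KEY is set; the protein_estimate schema is illustrative.
const res = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
  },
  body: JSON.stringify({
    model: "gpt-4o-mini",
    messages: [
      { role: "user", content: "Estimate protein: 2 eggs and a protein shake." },
    ],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "protein_estimate",
        strict: true,
        schema: {
          type: "object",
          properties: {
            total_protein_grams: { type: "number" },
            notes: { type: "string" },
          },
          required: ["total_protein_grams", "notes"],
          additionalProperties: false,
        },
      },
    },
  }),
});

const json = await res.json();
// The model's reply is a JSON string that conforms to the schema above.
console.log(JSON.parse(json.choices[0].message.content));
```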

Choosing a model


A key choice to make when generating content through the API is which model you want to use - the model parameter of the code samples above. You can find a full listing of available models here. There are a few factors to consider when choosing a model for text generation, chiefly the tradeoff between intelligence, speed, and cost.

When in doubt, gpt-4.1 offers a solid combination of intelligence, speed, and cost effectiveness.

Prompt engineering


Because the content generated from a model is non-deterministic, it is a combination of art and science to build a prompt that will generate content in the format you want. However, there are a number of techniques and best practices you can apply to consistently get good results from a model.

Some prompt engineering techniques will work with every model, like using message roles. But different model types (like reasoning versus GPT models) might need to be prompted differently to produce the best results. Even different snapshots of models within the same family could produce different results. So as you are building more complex applications, we strongly recommend that you pin production applications to specific model snapshots to ensure consistent behavior, and build evals that measure the behavior of your prompts so you can monitor performance as you iterate or change model versions.

Now, let's examine some tools and techniques available to you to construct prompts.

Message roles and instruction following


You can provide instructions (prompts) to the model with differing levels of authority using message roles.

Generate text with messages using different roles


```javascript
import OpenAI from "openai";

const client = new OpenAI();

const completion = await client.chat.completions.create({
  model: "gpt-4.1",
  messages: [
    {
      role: "developer",
      content: "Talk like a pirate.",
    },
    {
      role: "user",
      content: "Are semicolons optional in JavaScript?",
    },
  ],
});

console.log(completion.choices[0].message);
```

```python
from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {
            "role": "developer",
            "content": "Talk like a pirate."
        },
        {
            "role": "user",
            "content": "Are semicolons optional in JavaScript?"
        }
    ]
)

print(completion.choices[0].message.content)
```

```bash
curl "https://api.openai.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {
        "role": "developer",
        "content": "Talk like a pirate."
      },
      {
        "role": "user",
        "content": "Are semicolons optional in JavaScript?"
      }
    ]
  }'
```

The OpenAI model spec describes how our models give different levels of priority to messages with different roles.

| developer | user | assistant |
|---|---|---|
| developer messages are instructions provided by the application developer, prioritized ahead of user messages. | user messages are instructions provided by an end user, prioritized behind developer messages. | Messages generated by the model have the assistant role. |

A multi-turn conversation may consist of several messages of these types, along with other content types provided by both you and the model. Learn more about managing conversation state here.

You could think about developer and user messages like a function and its arguments in a programming language.

Message formatting with Markdown and XML


When writing developer and user messages, you can help the model understand logical boundaries of your prompt and context data using a combination of Markdown formatting and XML tags.

Markdown headers and lists can be helpful to mark distinct sections of a prompt, and to communicate hierarchy to the model. They can also potentially make your prompts more readable during development. XML tags can help delineate where one piece of content (like a supporting document used for reference) begins and ends. XML attributes can also be used to define metadata about content in the prompt that can be referenced by your instructions.

In general, a developer message will contain the following sections, usually in this order (though the exact optimal content and order may vary by which model you are using):

- Identity: the purpose, communication style, and high-level goals of the assistant
- Instructions: guidance on how to generate the response you want
- Examples: sample inputs paired with desired outputs
- Context: any additional data the model needs, often best placed near the end of the prompt

Below is an example of using Markdown and XML tags to construct a developer message with distinct sections and supporting examples.

Example prompt

A developer message for code generation


```
# Identity

You are a coding assistant that helps enforce the use of snake case
variables in JavaScript code, and writing code that will run in
Internet Explorer version 6.

# Instructions

- When defining variables, use snake case names (e.g. my_variable)
  instead of camel case names (e.g. myVariable).
- To support old browsers, declare variables using the older
  "var" keyword.
- Do not give responses with Markdown formatting, just return
  the code as requested.

# Examples

<user_query>
How do I declare a string variable for a first name?
</user_query>

<assistant_response>
var first_name = "Anna";
</assistant_response>
```

API request

Send a prompt to generate code through the API


```javascript
import fs from "fs/promises";
import OpenAI from "openai";

const client = new OpenAI();

const instructions = await fs.readFile("prompt.txt", "utf-8");

const response = await client.responses.create({
  model: "gpt-4.1",
  instructions,
  input: "How would I declare a variable for a last name?",
});

console.log(response.output_text);
```

```python
from openai import OpenAI

client = OpenAI()

with open("prompt.txt", "r", encoding="utf-8") as f:
    instructions = f.read()

response = client.responses.create(
    model="gpt-4.1",
    instructions=instructions,
    input="How would I declare a variable for a last name?",
)

print(response.output_text)
```

```bash
curl https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "instructions": "'"$(< prompt.txt)"'",
    "input": "How would I declare a variable for a last name?"
  }'
```

Save on cost and latency with prompt caching

When constructing a message, try to keep content that you expect to use over and over in your API requests at the beginning of your prompt, and among the first API parameters you pass in the JSON request body to Chat Completions or Responses. This enables you to maximize cost and latency savings from prompt caching.
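For example, a sketch of how that ordering might look in this lesson's script (the variable names here are hypothetical): keep the long, unchanging instructions first so the cached prefix is identical across requests, and put the per-request content last.

```javascript
// Static, reusable content first so the shared prefix can be cached across
// requests; only the final user message changes per call.
const messages = [
  { role: "developer", content: LONG_STATIC_INSTRUCTIONS }, // same every request
  { role: "user", content: todaysMeals },                   // varies per request
];
```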

Few-shot learning


Few-shot learning lets you steer a large language model toward a new task by including a handful of input/output examples in the prompt, rather than fine-tuning the model. The model implicitly "picks up" the pattern from those examples and applies it to a prompt. When providing examples, try to show a diverse range of possible inputs with the desired outputs.

Typically, you will provide examples as part of a developer message in your API request. Here's an example developer message containing examples that show a model how to classify positive or negative customer service reviews.


```
# Identity

You are a helpful assistant that labels short product reviews as
Positive, Negative, or Neutral.

# Instructions

- Only output a single word in your response with no additional formatting
  or commentary.
- Your response should only be one of the words "Positive", "Negative", or
  "Neutral" depending on the sentiment of the product review you are given.

# Examples

<product_review id="example-1">
I absolutely love these headphones — sound quality is amazing!
</product_review>

<assistant_response id="example-1">
Positive
</assistant_response>

<product_review id="example-2">
Battery life is okay, but the ear pads feel cheap.
</product_review>

<assistant_response id="example-2">
Neutral
</assistant_response>

<product_review id="example-3">
Terrible customer service, I'll never buy from them again.
</product_review>

<assistant_response id="example-3">
Negative
</assistant_response>
```

Include relevant context information


It is often useful to include additional context information the model can use to generate a response within the prompt you give the model. There are a few common reasons why you might do this: to give the model access to proprietary or otherwise private data outside its training data, or to constrain its response to a specific set of resources you have determined will be most useful.

The technique of adding additional relevant context to the model generation request is sometimes called retrieval-augmented generation (RAG). You can add additional context to the prompt in many different ways, from querying a vector database and including the text you get back in a prompt, to using OpenAI's built-in file search tool to generate content based on uploaded documents.
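As an illustrative sketch (retrieveDocs is a hypothetical helper standing in for a vector-database query or file search), the retrieved text is simply delimited and placed into the prompt:

```javascript
// Sketch of RAG-style prompt assembly: retrieveDocs() is a hypothetical
// helper that returns relevant passages for the user's question.
const docs = await retrieveDocs(userQuestion);

const prompt = [
  "Answer the question using only the documents provided below.",
  ...docs.map((d, i) => `<document id="doc-${i}">\n${d}\n</document>`),
  `Question: ${userQuestion}`,
].join("\n\n");
```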

Planning for the context window

Models can only handle so much data within the context they consider during a generation request. This memory limit is called a context window, which is defined in terms of tokens (chunks of data you pass in, from text to images).

Models have different context window sizes from the low 100k range up to one million tokens for newer GPT-4.1 models. Refer to the model docs for specific context window sizes per model.
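As a very rough budgeting aid (an approximation, not what the tokenizer actually produces), English text averages around four characters per token:

```javascript
// Rough token estimate (~4 characters per token for English); use a real
// tokenizer such as tiktoken when accuracy matters.
const estimateTokens = (text) => Math.ceil(text.length / 4);

console.log(estimateTokens("2 eggs, a chicken breast, and a protein shake")); // ≈ 12
```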

Prompting GPT-4.1 models


GPT models like gpt-4.1 benefit from precise instructions that explicitly provide the logic and data required to complete the task in the prompt. GPT-4.1 in particular is highly steerable and responsive to well-specified prompts. To get the most out of GPT-4.1, refer to the prompting guide in the cookbook.

[GPT-4.1 prompting guide](https://cookbook.openai.com/examples/gpt4-1_prompting_guide): Get the most out of prompting GPT-4.1 with the tips and tricks in this prompting guide, extracted from real-world use cases and practical experience.

GPT-4.1 prompting best practices

While the cookbook has the best and most comprehensive guidance for prompting this model, here are a few best practices to keep in mind.

Building agentic workflows

System Prompt Reminders

In order to best utilize the agentic capabilities of GPT-4.1, we recommend including three key types of reminders in all agent prompts for persistence, tool calling, and planning. As a whole, we find that these three instructions transform the model's behavior from chatbot-like into a much more “eager” agent, driving the interaction forward autonomously and independently. Here are a few examples:


```
## PERSISTENCE

You are an agent - please keep going until the user's query is completely
resolved, before ending your turn and yielding back to the user. Only
terminate your turn when you are sure that the problem is solved.

## TOOL CALLING

If you are not sure about file content or codebase structure pertaining to
the user's request, use your tools to read files and gather the relevant
information: do NOT guess or make up an answer.

## PLANNING

You MUST plan extensively before each function call, and reflect
extensively on the outcomes of the previous function calls. DO NOT do this
entire process by making function calls only, as this can impair your
ability to solve the problem and think insightfully.
```

Tool Calls

Compared to previous models, GPT-4.1 has undergone more training on effectively utilizing tools passed as arguments in an OpenAI API request. We encourage developers to exclusively use the tools field of API requests to pass tools for best understanding and performance, rather than manually injecting tool descriptions into the system prompt and writing a separate parser for tool calls, as some have reported doing in the past.
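A sketch of what that looks like in a request body (the get_weather function is illustrative; the tools field shape follows the Chat Completions function-calling format):

```javascript
// Sketch: declare tools in the request's tools field instead of describing
// them in the system prompt. get_weather is a hypothetical example function.
const body = {
  model: "gpt-4.1",
  messages: [{ role: "user", content: "What's the weather in Paris?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get the current weather for a city.",
        parameters: {
          type: "object",
          properties: {
            city: { type: "string", description: "City name" },
          },
          required: ["city"],
        },
      },
    },
  ],
};
```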

Diff Generation

Correct diffs are critical for coding applications, so we've significantly improved performance at this task for GPT-4.1. In our cookbook, we open-source a recommended diff format on which GPT-4.1 has been extensively trained. That said, the model should generalize to any well-specified format.

Using long context

GPT-4.1 has a performant 1M token input context window, which makes it useful for a variety of long-context tasks, including structured document parsing, re-ranking, selecting relevant information while ignoring irrelevant context, and performing multi-hop reasoning using context.

Optimal Context Size

We show perfect performance at needle-in-a-haystack evals up to our full context size, and we've observed very strong performance at complex tasks with a mix of relevant and irrelevant code and documents in the range of hundreds of thousands of tokens.

Delimiters

We tested a variety of delimiters for separating context provided to the model against our long context evals. Briefly, XML and the format demonstrated by Lee et al. (ref) tend to perform well, while JSON performed worse for this task. See our cookbook for prompt examples.

Prompt Organization

Especially in long context usage, placement of instructions and context can substantially impact performance. In our experiments, we found that it was optimal to put critical instructions, including the user query, at both the top and the bottom of the prompt; this elicited marginally better performance from the model than putting them only at the top, and much better performance than only at the bottom.

Prompting for chain of thought

As mentioned above, GPT-4.1 isn’t a reasoning model, but prompting the model to think step by step (called “chain of thought”) can be an effective way for a model to break down problems into more manageable pieces. The model has been trained to perform well at agentic reasoning and real-world problem solving, so it shouldn’t require much prompting to do well.

We recommend starting with this basic chain-of-thought instruction at the end of your prompt:


```
First, think carefully step by step about what documents are needed to
answer the query. Then, print out the TITLE and ID of each document.
Then, format the IDs into a list.
```

From there, you should improve your CoT prompt by auditing failures in your particular examples and evals, and addressing systematic planning and reasoning errors with more explicit instructions. See our cookbook for a prompt example demonstrating a more opinionated reasoning strategy.

Instruction following

GPT-4.1 exhibits outstanding instruction-following performance, which developers can leverage to precisely shape and control outputs for their particular use cases. However, since the model follows instructions more literally than its predecessors, you may need to be more explicit about what to do or not do, and existing prompts optimized for other models may not immediately work with this model.

Recommended Workflow

Our recommended workflow for developing and debugging prompt instructions is iterative: start with a high-level instructions section, add sections for specific behaviors as needed, and when you observe failures, make the relevant instruction more explicit.

Common Failure Modes

A few common failure modes crop up repeatedly; they are not unique to GPT-4.1, but we share them here for general awareness and ease of debugging.

See our cookbook for an example customer service prompt that demonstrates these principles.

Prompting reasoning models


There are some differences to consider when prompting a reasoning model versus prompting a GPT model. Generally speaking, reasoning models will provide better results on tasks with only high-level guidance. This differs from GPT models, which benefit from very precise instructions.

You could think about the difference between reasoning and GPT models like this: a reasoning model is like a senior coworker, someone you can hand a goal and trust to work out the details, while a GPT model is like a junior coworker who performs best with explicit instructions for producing a specific output.

For more information on best practices when using reasoning models, refer to this guide.

Next steps


Now that you know the basics of text inputs and outputs, you might want to check out one of these resources next.

[Build a prompt in the Playground](/playground): Use the Playground to develop and iterate on prompts.

[Generate JSON data with Structured Outputs](/docs/guides/structured-outputs): Ensure JSON data emitted from a model conforms to a JSON schema.

[Full API reference](/docs/api-reference/responses): Check out all the options for text generation in the API reference.
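
To close the loop on the prompts in sections 1 and 2, here is a minimal sketch of the single-file script they describe: plain fetch, dotenv, and gpt-4o-mini, with the day's meals entered directly in the file. The file name, prompt wording, and meal list are illustrative; it assumes Node 18+ (for the global fetch) run as an ES module (node protein.mjs, or set "type": "module" in package.json).

```javascript
// protein.mjs — minimal sketch of the single-file protein tracker.
// Assumes a .env file containing OPENAI_API_KEY and dotenv installed.
import "dotenv/config";

// The day's meals, entered directly in the file as planned.
const meals = "2 eggs, a chicken breast, Greek yogurt, and a protein shake";

const response = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
  },
  body: JSON.stringify({
    model: "gpt-4o-mini",
    messages: [
      {
        // A system message works across chat models; newer docs call this
        // the developer role.
        role: "system",
        content:
          "You are a nutrition assistant. Estimate the grams of protein in " +
          "each food the user lists, then give a daily total.",
      },
      { role: "user", content: meals },
    ],
  }),
});

const data = await response.json();
console.log(data.choices[0].message.content);
```

Run it with `node protein.mjs`; each run is independent, so there's no conversation state to manage.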



### Pre-Requisites

- Cursor installed

### Lesson

### CMD + I

- Use Cursor Agent to initialize a NextJS project with TailwindCSS
- Initialize a Supabase directory with a sample edge function that we test out
- Initialize a Supabase test migration with a sample public database and some seed data
- Test that our frontend can hook up to this sample database

### CMD + L

- Ask it to explain how the NextJS repository works, and add the chat-demo page.
    - @ keyword to tag files
    - @ NextJS documentation
    - @ Screenshot the current chat-demo page directly, and ask it to modernize it
    - @ Add OpenAI documentation, which we'll need later

### CMD + K

- Refactor code to use Tailwind CSS
- Add error handling