OpenAI Codex Essentials Certification Course

This course provides essential training for the OpenAI Codex certification exam, covering practical features and hands-on labs to help developers master AI coding tools. It includes free practice exams and prepares learners for AI programming jobs by teaching how to effectively utilize Codex for development tasks, automation, and workflow integration.

freeCodeCamp.org

Full English Transcript of: OpenAI Codex Essentials – AI Coding Agent

Hey, this is Andrew Brown, your favorite tech instructor, bringing you another certification course. And this time, it's the ExamPro Codex Essentials Certification. How are we going to pass it? We're going to do the thing that we always do. We're going to do lectures and hands-on labs in our own account. And as always, I'm going to provide you a free practice exam so that you can go ace that exam. Get it, put it on your resume and LinkedIn, and go get that AI coding job you've been looking for. And if you love this kind of material, the best way to support it is to purchase the additional paid materials over at exampro.co.

You will see the address all over this course. You get additional things like practice exams, cheat sheets, support, and more. And the key thing is that we continue to update these courses. So even when they're released for free, we reserve some of the content and put it only on the ExamPro platform, and then we continually update it. Okay? So, you know, really support that, because it's becoming harder to make these courses and I want to keep making free courses. So you've got to help me out here. Okay. If you do not know me, I've taught so many courses. We've done Claude, we've done AWS, Azure, GCP, Kubernetes, Terraform, NVIDIA, and more, like 50 courses. So

you are in really good hands. And I hope you enjoy this Codex course. Let's jump into it. Hey, it's Andrew Brown, and we are at the start of our journey asking the most important question first, which is: what is the Codex Essentials? So, the ExamPro Codex Essentials is a certification course that walks you through all the features of Codex, has a strong focus on practical, hands-on knowledge, and we do not shy away from what doesn't work. So, we're going to have a clear picture of how to utilize this tool. The exam code for this certification is EXP

Codex01. So, if this code changes, that means that there's been a new certification. Just be aware of that code in case it has changed. So consider the EXP Codex if you want to explore all the features of Codex in the shortest possible amount of time so you can fully realize the product, you need practical knowledge over sales knowledge, or you're preparing for the Claude Code boot camp. You might say, why did you put Claude Code boot camp? Well, that's because originally I made a Claude Code Essentials course.

And then there was demand to make the Codex course, and so currently I have pending a boot camp, a project-based boot camp. We're still just calling it the Claude Code Boot Camp. We're probably going to let you use whichever tool you want, and I just haven't changed the name. So you can think of it as Codex boot camp, Claude Code boot camp. I don't have a name for it, but we'll continue on to the road map to see how all this stuff fits in. So over here we have Codex, and the Codex course is shorter than the Claude Code course, and the reason why is there's just less going on in Codex.

Okay. And so, you know, I couldn't make it a 12-hour course. It's basically four and a half to five hours of course content. And there's a lot of similarities here. So, if you've already done the Claude Code course, you're going to find Codex very easy and straightforward to do. It'll be a really fast and good ramp-up to just add more tooling to your belt. If you have yet to do so, you might want to also do the GenAI Essentials. This will give you foundational knowledge across the board, not just in AI tooling, but in AI in general. The Claude Certified Architect course, we have that as well. And so this thing, I know, again, it's for Claude,

but the important part is more so the orchestration and prompting in that particular course, and that can be applied to any of these tools. It can be applied to Claude Code, Codex, Gemini CLI. And so, you know, not all the content in there is useful if you've done the Claude Code Essentials, but definitely domains one, two, and possibly five are extremely valuable and could be applied everywhere else. Maybe at some point we will generalize that content out of there so that it's not Claude-specific. But it's still good content nonetheless. We have the Claude Code from Zero, which probably will have a Codex from Zero, a Claude Code from Zero, a Gemini CLI from Zero, but we've got

to name it that way because most people entering into these tools come through a very specific one. But this will be for people with zero tech background. So if you need all the steps explained, it's going to be here, and it probably will have a subdomain that will be like "from zero". That thing is definitely going to get developed. There's huge interest in this. So if you're watching this and you're going, Andrew, I can't handle what you're doing in the Claude Code Essentials or the Codex Essentials. I'm not a developer.

Well, do not worry, because we will take care of you in the from zero courses. But just understand that from zero courses are charged at a higher rate because they require a certain level of production, and so that's just what they are. So if you are from zero and you're not from a developer background or a tech background, then, you know, just understand that's what it's going to be. Okay. But the idea here is that once we have our fundamentals content, we can roll it into a project-based boot camp so we can apply that knowledge and know what it will be. Again, you know, this is the Claude Code boot camp, but we will probably generalize it for all

three providers, as we're finding that people are working across those three and we need to service those three. I don't have the Gemini CLI badge in here yet, but it'll obviously be in this road map when we update it for the Gemini CLI. And, you know, there is interest in doing a professional enterprise architect course for enterprises. Right now enterprises are interested in this tooling, but unless you're in the States there are some restrictions in terms of usage. So as governance and security improve, there's strong consideration that I might produce an enterprise architect course. This is something that I would come in and do at organizations.

So if you're a large organization and you want to know how to apply these tools at the org level or at your government level, then you've got to reach out separately for that. That's not for regular consumers. Okay. But I put it here on the road map just in case you're interested. There is interest in, like, an AI security certification. I don't have that yet. I will consider it. But this is the road map here. And really, most of the value is going to be in doing these four here. And again, this one here is for ultra beginners. Okay. So how long should you study to pass? This is a

short course, but if you're a beginner, look at 12 hours, look at 20 hours. I think that you should start with Claude Code first, to be honest, because it is just more robust, has more features, and you're just going to get a better learning experience. But if you want to start with Codex, you absolutely can. There's no harm in that. But here, if you don't have knowledge of Codex or programming, you're looking at 12 hours, even more. If you're experienced, you're going to be done with this in four hours. It's not a long course. It doesn't take too long.

Four to six hours. I mean, you should replicate it in your own environments and do your best. And so, you know, I put 10 hours. I don't think it's 10 hours. I would say like six hours here. I just forgot to update that. So, six hours would probably be the average study time here. 50% lecture and labs, 50% practice exams, one hour a day for seven days if you need to go for seven days here. Seven, or one to two hours, you know, depends on how long it will take you, but generally one hour a day for seven days should get you there. How are we going to pass? Well, there's lecture video content. Do those hands-on labs. I know they're going to be hard to

follow, but do your best. And we have online paid practice exams, so you can go simulate it. And we always give you a free practice exam so that you can give that a try. We are the test provider for this exam. So if you want it, you need to pay and sign up to do the final exam. Here it's showing 65 questions, but technically there are fewer questions than that. I just have a generic image in here since I don't feel like creating a screenshot every time

here. But when you go on the ExamPro platform, there will be a spot to do the final exam, and you only have a few attempts, one attempt. It's not intended to be hard. It's just there to validate your knowledge. So it shouldn't be too difficult. So where can you do this? From the convenience of your own home on the ExamPro platform. And you need to complete a percentage of the course on the platform. If we do release this on freeCodeCamp and you're watching this on freeCodeCamp, some people will move over to the platform and then they'll pay for it.

And so I just want you to know that we'll probably set the bar at like 55% course completion, and then you need to pass the exam. Some people come over and, you know, they've already watched some of it on freeCodeCamp, and so we've tried to relax that so it's a bit easier for people to go and do their final exam. There are fewer domains here, as there are just fewer things going on, but we have core concepts and foundations, Codex services and features, and advanced capabilities. Each domain is weighted differently, and so it'll determine how many questions you get. The passing grade is 700 out of 1,000 points. So you need around 70% to pass. It's using scaled scoring.

That means if you get exactly 70%, you could still fail. Aim to go a bit higher. Okay, there are 50 questions. You can afford to get 15 questions wrong. There are no penalties. The question format is multiple choice. And that's it. Okay. The duration of the exam is 1 hour. You get 1.2 minutes per question. The exam time is 60 minutes. The seat time is 90 minutes. The seat time refers to the amount of time that you should allocate for the exam. It includes time to review instructions, show the online proctor your workspace, read and accept the NDA, and then you can

go ahead, complete the exam, and provide feedback at the end. This cert is valid for 24 months, so two years before recertification. If you are bought into this course on the ExamPro platform, I'm updating it as features come out. So if you want to continuously be learning and keeping your skills up to date with Codex, be on that platform. It's up to you. I want to make it clear that cloud and AI certifications expect you to have technical skills and a bunch of other stuff. If you do not have these skills, AI is not going to help you out. You still need to learn them. freeCodeCamp is a great resource, and on ExamPro we have a subscription supporter package where we

have additional materials to lay down that foundational content that's going to really help you. Obviously the from zero course will give you a lot of that foundational knowledge if you are missing it, and fill in those gaps. But certifications serve as a way to set a goalpost to get you to the end. Okay? You've still got to do a bit of everything. All right, so there we go, and we are ready to jump into it. Hey everyone, it's Andrew. I just wanted to tell you that the repository that you're going to see throughout this course, and all the code that I'm using that I pushed to GitHub, is here at ExamPro Codex

examples. I generally follow this pattern for my courses. I'll have, you know, Claude Code examples, AWS examples, Azure examples, whatever. And so if you're ever looking for that stuff, that is where it is. Hopefully you can utilize this stuff. For the most part, I would hope that you're doing everything from scratch just as I am. But, you know, if you want to try to get similar results and you want to reference my code, it is over here. You can absolutely use it. But there you go. Okay. What is Codex? Well, it is an agentic coding tool, or you could also say a coding harness. It can read your codebase, edit files, run commands,

and integrate with your development tools. You can find it over at chatgpt.com/codex. It's available on many different surfaces, such as your terminal, IDE, desktop app, browser, and more. But the one that you are going to see the most is it looking like this, which is in the CLI. Its use case is generally coding operations. It's not limited to that, but the idea here is that it can automate your workflow, build features and fix bugs in your codebase, create commits, deal with pull requests, connect to MCP tools, and be customized with instructions, skills, and hooks. Skills are interesting because they adopt the agent skills framework. So, if you have skills over in Claude

Code, you can bring them over to Codex. I'm not sure if Gemini CLI has them, but if they do, that'd be really, really nice. You can build custom agents. I have "run agent teams" here, but at this time this is not possible. I just kind of predict that maybe they will try to offer this, as Claude Code has it, or maybe there's a third-party solution that we can utilize. So if we can, I'm going to try to do that in this course. If we can't, then you know that I wasn't able to, and that's why it's not in here. We can pipe things to Codex. You can create scripts and automate things with the CLI, and it can

work from anywhere, as it's available on many different surfaces. So, but what is Codex? Codex is a CLI program, and so by nature it runs in your terminal. Codex is specialized to write code, but it can also write docs, run builds, search files, research topics, and more. And Codex uses an agentic loop, which is very important. It has multiple models and it has tool calls. And so now that we know that, the thing that we really want to take a look at here is the agentic loop, as every provider has their own agentic loop and they're different, and it's nice to know how it's working.
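As a rough picture of what an agentic loop generally looks like (a prompt goes in, the model reasons, it asks for tool calls, the results are fed back, and it repeats until there is a final answer), here is a minimal Python sketch. It is not Codex's actual implementation, which this course says is not documented. The names `fake_model`, `TOOLS`, `list_files`, and `agentic_loop` are all hypothetical, and the "model" is a hard-coded stand-in rather than a real API call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ModelStep:
    """One inference result: either a tool request or a final answer."""
    tool: Optional[str] = None    # name of a tool to call, if any
    arg: str = ""                 # argument for that tool
    answer: Optional[str] = None  # final answer, if the model is done


def fake_model(history: List[str]) -> ModelStep:
    """Hypothetical stand-in for model inference (no real API is called)."""
    tool_lines = [line for line in history if line.startswith("tool:")]
    if not tool_lines:
        # Nothing observed yet, so ask for a tool call first.
        return ModelStep(tool="list_files", arg=".")
    observed = tool_lines[-1].split(" -> ", 1)[1]
    return ModelStep(answer=f"The folder contains: {observed}")


# Hypothetical tool registry; a real harness would read files, run commands, etc.
TOOLS: Dict[str, Callable[[str], str]] = {
    "list_files": lambda path: "example.rb",
}


def agentic_loop(prompt: str, max_turns: int = 5) -> Tuple[Optional[str], List[str]]:
    """Loop: infer -> call tool -> feed result back, until a final answer."""
    history = [f"user: {prompt}"]
    for _ in range(max_turns):
        step = fake_model(history)
        if step.answer is not None:
            history.append(f"assistant: {step.answer}")
            return step.answer, history
        result = TOOLS[step.tool](step.arg)
        history.append(f"tool:{step.tool} -> {result}")
    return None, history  # stopped without an answer (turn limit reached)


if __name__ == "__main__":
    answer, history = agentic_loop("What is in this folder?")
    print(answer)
```

The `max_turns` limit plays the role of a stop condition here; in an interactive tool, the user interrupting the loop would be another way out.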

Okay. So, let's take a look at the agentic loop for Codex. Well, the key difference between something like Claude Code and Codex is that we just don't know with Codex, because they do not tell you how the internal mechanism works. With Claude Code, they define three very specific steps, and Codex could be using those specific steps, but we just don't know. What we do know is that it takes in a prompt, which will invoke the agentic loop, and in that mystery box, which is model inference, it will have some level of reasoning, and then it's going to make calls out to tools. Okay. And then it's going to return that back to the model. It's going to keep looping until

it gets to an end result, and at any given time you can interrupt it. So that is their agentic loop. I wish I knew more. There are articles online from OpenAI that cover Codex in great detail, and I've read them, and they didn't tell me a whole lot. Okay. So we just don't know. And, you know, I would say Claude will tell us things and Codex or ChatGPT will tell us things, but it's hallucinating a lot and guessing, which means to me that we just don't really know. Okay. So for Codex there are different models we can use, specifically OpenAI models, and they have their own family of models called GPT. And

based on what models you use, you're going to get different speed, cost, and intelligence. If we were to open up Codex either in the desktop app or the CLI, we can get a list of possible models. At first, it looks like there's more than what Claude Code has, but basically it's very similar. They have a Goldilocks model in the middle, right, which is just the default one, or doesn't have a particular name. Then there's mini, which is just like Haiku, I suppose, and then there is max, which would probably be like Opus. But notice that we have very specific Codex ones. So these ones are going to be specialized for working with Codex, and

so obviously coding, but we will have to figure out why they would give us access to, you know, this one and these ones. Why wouldn't we just be using the Codex one? So that's something we should try to answer. And they always have these charts. Well, they used to always have these charts. Both Claude and OpenAI seem to not be making them anymore, but what they would do is give you a comparison of where the models would sit. So, you would see, you know, GPT-4 over here, and then its mini version over here, and then nano down below.

Notice we don't have nano over here. Notice there is no 4 in our list anymore. And this obviously is an old chart because they're not making them anymore, but I still wanted to show it just so you get an idea of where things could sit. OpenAI does have model cards with a good amount of detail on cost, usage, reasoning, stuff like that. So, that's really cool, and that's something we're going to take a look at. Okay, so here we are on the OpenAI developers website. I just typed in "OpenAI developers models" to find it in Google, and you can see we have a bunch of models. You have to be a bit careful because there could be models here that

are not accessible in Codex. So for instance, we see GPT-5.4 nano. I'm not sure if that's actually available to us, but if it is, we can come back here and take a look. There's also models that are not specific to Codex. So we have image generation, transcription, other things. So we have to know what it is that we're looking for. So if it's 5.4, which is what is currently out, this video might be dated and so there might be newer models. But here you can see highest reasoning, medium speed, input $2.50 to $15. Over what volume needs to be very clear, because that sounds really expensive, but that's probably not the real cost, right? Like, it says this here, but at what volume? So this

says per 1 million, right? So that's a lot of tokens, so that's actually, you know, pretty good. There's also batch API pricing, which means that you do not need a result instantly and you can wait, and so that would be cheaper. But notice down below here, compare. So 5.4 is more expensive than 5.2. And then mini is way cheaper. And here it's showing its availability, right? And then we have rate limits down below. Let's go back a step here. I'm curious if we can see the Codex models in here. So we'll go to view all.
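To make the "per 1 million tokens" point concrete, here is a small Python sketch of the arithmetic. It assumes the $2.50 and $15 figures read from the model card are the input and output prices per 1 million tokens; that split is an interpretation of what is on screen, not something the transcript states outright. The `batch_discount` parameter is hypothetical and defaults to no discount, since no batch price is quoted here.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_per_million: float = 2.50,    # assumed input price, USD per 1M tokens
    output_per_million: float = 15.00,  # assumed output price, USD per 1M tokens
    batch_discount: float = 0.0,        # hypothetical fraction off for batch use
) -> float:
    """Estimate the cost of one request from per-million-token prices."""
    cost = (input_tokens / 1_000_000) * input_per_million
    cost += (output_tokens / 1_000_000) * output_per_million
    return round(cost * (1.0 - batch_discount), 6)


if __name__ == "__main__":
    # A request with 20,000 input tokens and 2,000 output tokens.
    print(estimate_cost_usd(20_000, 2_000))
```

Under those assumed prices, a single request with 20,000 input tokens and 2,000 output tokens comes to a few cents, which is why the headline dollar figures look more alarming than typical usage is.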

Okay. And what I'm looking for is Codex. Can we see Codex coding models? Here we go. So we have 535. We'll go over to here. Okay. So, we have Codex, which is optimized for agentic tasks in Codex or similar environments. It supports low, medium, high, and xhigh reasoning efforts. So, when it's talking about efforts, and I'll have to make a separate slide on this, effort determines how much reasoning the model does on a task, right? So that's another thing that we will probably want to adjust. But that is separate from the actual underlying model intelligence, right?

Okay. You can see its cost in comparison here. Pretty good in comparison to GPT-5.4. Like, this one, I think it's cheaper. But anyway, you know, you can see that information here. Let's go over to the max. So what I'm looking for here is, I want to know, is it just super intelligent or is it just a larger context window? So I'll open this tab here. We'll go over to this one here. And so what I'm looking for here is maybe its context window size. So over here it says context window size 400,000. Over here this one's 400,000. So the max is not a larger context

window. And so it must be more intelligent, like Opus, I suppose. But here it says optimized for long-running tasks. With Claude, they say, like, Opus has more intelligence. This is not saying it has more intelligence. This is saying that it's better for long-running tasks. We have one, two, three, four reasoning here. Four reasoning. So this has the same thing. So we cannot just apply our knowledge from Claude over to this. And that's why we have to check these things, because we never really know for certain. Also notice that there is a max output tokens, and then here we have the context window. And so that's really interesting. So, I

wonder how that works if you have a lot of stuff coming in and going out. But when you use Claude, they don't show you both. They just show you one, like what the context window is. So it seems like it's larger here because this is 400,000, and Claude models at this time are at least 200,000, and that would be input and output. But here they're saying, like, okay, the context is 400,000, but we're limiting the max output. So it's not necessarily larger than Claude in that sense. The output is probably the same, right? 100,000. So that's kind of interesting.
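One way to think about how a context window and a max output limit interact is sketched below in Python. It assumes the context window is a shared budget for input plus output, and that output is additionally capped by the max output tokens; that is a common way these limits are described, but treat it as an assumption rather than a statement about Codex internals. The 400,000 figure is the one read from the model card, and the 100,000 output cap is only the guess made in the video.

```python
def max_input_tokens(context_window: int = 400_000, reserved_output: int = 100_000) -> int:
    """Tokens left for input if the full output allowance is reserved."""
    if reserved_output > context_window:
        raise ValueError("reserved output cannot exceed the context window")
    return context_window - reserved_output


def fits(
    prompt_tokens: int,
    wanted_output: int,
    context_window: int = 400_000,  # figure read from the model card
    max_output: int = 100_000,      # the video's guess, not a confirmed number
) -> bool:
    """True if a request respects both the output cap and the shared window."""
    within_output_cap = wanted_output <= max_output
    within_window = prompt_tokens + wanted_output <= context_window
    return within_output_cap and within_window


if __name__ == "__main__":
    print(max_input_tokens())       # input budget when output is fully reserved
    print(fits(350_000, 50_000))    # exactly fills the window
    print(fits(10_000, 120_000))    # small prompt, but output exceeds the cap
```

Under these assumptions, a small prompt still cannot get more than the output cap back, and a very large prompt squeezes how much output can fit.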

Is there anything else here that we should take a look at? I don't think so. But the one thing I wanted to see was, is there like a nano? There is no nano. Okay. But yeah, those are the models. And when we want to know more about them, we can come back and take a look here. Okay. Okay, so now that we know what Codex is, how do we get it installed? Well, it's pretty simple. It's a single-line install, though you do need to have Node.js installed on your system. It's recommended to use WSL 2, a Mac, or

Linux. On the Windows side, could you do it? Maybe. I don't think I would try to install it on the Windows side of Windows. I would install it in WSL, because otherwise you're probably going to run into other problems, as coding environments are set up for Linux. So we will go ahead and do that. What's interesting is that Claude used to have this approach as a global Node package, but they moved away to an isolated binary, which made it so you didn't need Node.js, but Codex is still relying on Node.js to deliver their CLI program. To launch it, very straightforward: once we have it installed, we type in codex and it launches. Okay, we log in with the login command. So you would type in

codex and then type /login. You could probably also do, like, codex login, but I like to launch it from there. We have three options. Looks just like Claude. Okay. There is with the subscription, which is the easiest way. We have device code, which is with a one-time code. I don't think we're going to show that, but, you know, it's nice that you can do that. And then we have API key. So that can come from a first-party provider or a third party. The only third party is Azure, as OpenAI has a good partnership with Microsoft, and so that's how we'll log in. If we want to check if we are logged in, or the status of our login, we type in /status. This is different

from Claude Code. In Claude Code, they have auth status, which just literally shows you the status of login, but this status shows you more than just whether you're logged in. It shows your usage, which account, what mode you're in, the directory of the project that is being loaded right now, what the current model is. So we're getting more information, but for our session we have the session thing, so it's showing us how we are logged in. And then over here, if we were utilizing the API key, it would show here API key configured to run with Codex, right? And then over here it's showing the account that you're utilizing. So here we can tell it's a

subscription. So this one's a subscription, okay, and then this one here is with an API key. To log out, you type /logout. Pretty straightforward, and it will show you that you are logged out. But there you go. Okay. Hey folks, this is Andrew, and in this video we are going to look at installing Codex. So I'm going to go over and type "Codex ChatGPT" into Google. I'm just doing this offscreen, and we will drag this on over here. Okay. And I already have a ChatGPT subscription. So, you know, if you want to proceed forward here, you need a subscription and a developer API account. Okay, we'll generally show you where they are as we work through this. But I've typed in

"Codex ChatGPT", and what I'm looking for here is that option. So, here we have chatgpt.com/codex. That's probably the site we're looking for. Okay. And here we want to get started with it. So, we can download the IDE. We're going to do that later. I'm focused on the CLI, as this course is optimized for developers. I'm a developer. I hope you are a developer too. If you're not, you will learn along with us, which is fine. And so I already have Node installed, but I'll go ahead and hit enter. If you do not have Node, search "how to install Node.js WSL", okay, to look it up. There's instructions. If you don't have WSL, look up how to install WSL. I'm not covering them in this course as that

would require a lot of work. I probably have videos somewhere recorded for this stuff. Search my back catalog and try to find it. Okay. Anyway, so we ran that line and now we should have Codex installed. I'm going to type in codex. Okay, it's going to launch up now. I already have this installed, so it might be a little bit different, but right away it's saying you are in this directory, do you want to trust its contents? I'm going to say no because I

wasn't ready with a folder yet. I'm going to make a new folder called Codex Hello World. Actually, I'm going to make a new repo here called Codex examples. Okay. And in here, I'm going to make a new folder called Codex Hello World, as we need to make our first hello world, of course. Okay. And now I'm going to type in codex. So when you do this, I'm assuming that it's initializing a project somewhere. This is something that Claude does, and I feel that Codex probably does that as well. We will find out as we work through this.
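If the `codex` command does not launch at this point, a quick sanity check is whether the prerequisites are actually on your PATH. The Python sketch below only looks for the `node`, `npm`, and `codex` executables; it does not install anything. The one-line install itself is not read out in the video; as far as I know it is `npm install -g @openai/codex`, but treat that as an assumption and confirm it on the Codex get-started page. The helper names `check_prerequisites` and `report` are made up for this example.

```python
import shutil
from typing import Dict, List, Optional, Sequence


def check_prerequisites(
    commands: Sequence[str] = ("node", "npm", "codex"),
) -> Dict[str, Optional[str]]:
    """Map each command name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in commands}


def report(found: Dict[str, Optional[str]]) -> List[str]:
    """Turn the lookup result into human-readable lines."""
    lines = []
    for name, path in found.items():
        status = path if path else "NOT FOUND on PATH"
        lines.append(f"{name}: {status}")
    return lines


if __name__ == "__main__":
    for line in report(check_prerequisites()):
        print(line)
```

Running this inside WSL (rather than on the Windows side) matters, since each environment has its own PATH and its own Node.js install.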

It says, do you trust the contents? I'm going to say yes. And so now we are in Codex. We might not necessarily be authenticated. I'm going to type in /status, and I am logged in. So I'm going to just log out here so you can get a similar experience. So now I'm logged out, and I'll type codex again. And so we'll get Codex launched up here. And right away it's asking me how I want to authenticate, because it knows there's nothing here. I'm going to use ChatGPT. So I'm going to hit enter, and it's going to give me a link. It's going to try to open this thing up. It doesn't work well in WSL 2, so I have to copy this directly. If you are on a Mac with this, it'll be a lot

easier. So, I'm over here, and I authenticate with my Google account. So, I'm going to go over here to my Google account. We are going to get in there, and I'll hit continue. Okay. And now we are authenticated. So, it says, you know, you are signed in, you can close this now. I'll go ahead and hit enter. I'm now logged in, and I want to set up our most basic project here. So, I'm going to just say a new file. Whoops. Actually, I'll just tell it, like, can you make me a hello world .rb file that prints hello world? Okay, so I'm going to go ahead and ask it. I'm on GPT-5.4.

We'll talk about how to switch things and do that stuff later. Right now, we are doing the most basic thing. Okay. So, I'm just generating that hello world. And I don't see it here. Oh, you know why? Anytime you have a new folder, I've got to do code . to open that context here. It's just reopening that there. I'm not sure why it's downloading, but there we go. And so, now we can see that it generated our hello world. Look at that. Okay. And so we have now successfully installed it and authenticated. Again, you can check your status, right? You go to Codex here.

You can check your status. If you're logged in, by going here, you can see I have this information. Okay, I'm going to make that a little bit bigger. And that's all we want to do. Just make a hello world file. That's about it. We'll say git add, git commit -m, oops, git init. I guess I might not have made that repo yet. But anyway, I'll push that to GitHub. Okay, this repo, it'll be in my, you know, ExamPro, right? So like here I have one for Claude.

I actually just pushed it. I was late doing it, but I will push it here to, you know, Codex examples. Okay, ciao. Okay, let's take a look here at how we use Codex with an API key. So, a Codex API key can be used to control costs and usage for Codex. It's useful for production use cases and automated systems. Or, you know, maybe you need more usage than the subscription can provide, and so this is a way to go beyond the limits of your subscription. But the idea here is that you will generate an API key and you will paste it into Codex, and

then you can also set it via an environment variable. This is probably what you would do if you were working in a remote system, setting it up for automation. That would be the approach you'd have there. In order to add credits, you need to go to platform.openai.com, have an account, sign up, and then once you're there, you would add to your balance. So, okay folks, let's go ahead and get API key access hooked up here. So, I'm going to make a new, actually, we don't even really need a new folder. We just need to go ahead and log out of Codex. So, we'll go here, and, yeah, we can say yes on this. I just want to make sure that we are

indeed logged in. So, let's go ahead and log out. And so, now I'm not logged into anything. Okay. And the next thing we need to do is go over to the platform. Was it chatgpt? So we're trying to find where it is. It's platform.openai.com. I really dislike... Oh, it brought us right in here. But sometimes, if you're not logged in... I'm going to see if we can see the actual main one. Let's go to openai.com, because I just want to show you how confusing this can be. So you go

over to OpenAI. Okay. You think, okay, well, where is it? You know, developers. Great. Okay. And so you've got to go to API platform here, right? And we're still not in, which is so darn confusing. But then, you know, we can go here and click on start building. Go to the API dashboard. Now, you do have to have an account already. So, go ahead and create your account. And then we do have stuff over here. What we're interested in is our API key. And here I have a test one. I'm going to go ahead and revoke my key. Okay. I do have an org. So if you're personal, it's going to be a little bit different. Okay, but I have an org, and so it's pretty much the same. And I'm going to go ahead and add a key.

All right. And so here we're going to just say API key test. And you have projects. You always get a basic one, so I'll leave it in there. We have all, restricted, and read only. I'm going with all because I need it to actually do something and work. There's our key. I'm going to go over to here. We will type in codex. Whoops. I'm just going to go to the Hello World app. Type in codex here. I'm going to sign in with my own API key. I'm going to paste it in. We'll hit enter. Okay. And I'm now going to ask it, like, can you tell me the contents of this repo, of this folder? Okay. And so I'm just asking a very basic question. It shouldn't take too long.

And so it's showing the contents there. And so it works, right? So that's one way. I'm going to go ahead and log out because I want to see if the other way works. So, if we type in codex, notice that it's prompting us. I'm going to hit Ctrl+C to get out of there. The next way is to set that API key as an environment variable. I need my slide to go check. Let me find the right one here. It's going to be export OPENAI_API_KEY=, and I'm going to paste that key in there. I'm just going to go ahead and copy it again. I just clicked the copy button here. Okay. I'm

going to hit paste. We'll go to the end and hit enter. And now I'm going to type in codex. At first it looks like it didn't pick it up, because it's still asking me to provide my own API key. Oh, but there it is, already filled in. I didn't set that, right? So it came in from the export. So we'll go ahead and hit enter. Okay. That's really interesting. All right. So I ask, what is in this repo? Give it a second there to tell us what's in that repo. Not a whole lot, right? And it's thinking. Yeah.
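
A minimal sketch of the same check done programmatically, assuming only that the variable is named OPENAI_API_KEY as in the export above. The function name `describe_api_key` is made up for illustration; it reports whether the key is present without echoing the secret, which is handy in automation on a remote system.

```python
import os


def describe_api_key(env=None):
    """Report whether OPENAI_API_KEY is set, without printing the secret."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "")
    if not key:
        return "OPENAI_API_KEY is not set"
    # Show only the last four characters so the key is never echoed in full.
    return f"OPENAI_API_KEY is set (ends with ...{key[-4:]})"


print(describe_api_key({"OPENAI_API_KEY": "sk-example-abcd1234"}))
print(describe_api_key({}))
```

Calling `describe_api_key()` with no argument checks the current shell's environment, so it reflects whatever `export` or `unset` did before the script was started.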

Minimal project, right? So I'm going to go ahead and run /status. Here we can see token usage, 99%, data not available, collaboration mode, and API key. So I'm going to go ahead and log out. I want to unset that key because I want to prove that it came in that way and that it wasn't coming from somewhere else. So that's unset OPENAI_API_KEY. It's very hard for me to remember that. Just trying to find the correct slide here. Come on, where are you, slide? I'm looking off screen. Sorry about that.

There it is. Okay. So, unset OPENAI_API_KEY. I'm going to make sure it's not there, so I'm going to check env, making sure nothing is set. Okay, perfect. And I'm going to type in codex again and go down here. Notice it didn't fill it in. So if it's set via the export, it will pick it up here. You still have to go through that interface. Not sure why they make you go through that interface, but I guess it's a good way to explicitly know which one you are utilizing. But yeah, pretty

straightforward. The other thing we will do is just make sure we get rid of that key. So, I'm going to go over to here, where I have this key, and go ahead and revoke it. There we go. And somewhere in here you can load up your credits. The reason I don't want to show that is because I don't want to end up exposing any financial information. But somewhere in here you can load it up. And I feel like if you were to go here, it probably would prompt you to go to the right place to buy. So again, this is a developer course.

I think you can find it, folks. But there you go. Okay. Ciao ciao.

Okay, so let's take a look at ChatGPT subscriptions and how they interact with Codex. So basically, as far as I understand, if you have a subscription, you get access to Codex, but you really can't do much with Free. Go is going to limit you from advanced reasoning models, which you're going to want. So you're going to want to pay that $25, okay? Whatever that is. This is American, so in Canadian it's more than that for me, but I'm paying 30 Canadian for this and 30 Canadian for Claude. So that sounds about right. In terms of usage, when you're using Claude and Claude Code, they share

the same usage, right, across everything. That's not how ChatGPT and OpenAI work. The way it works with ChatGPT and OpenAI Codex is they don't share usage. They have their own buckets. So you can freely use ChatGPT without it impacting your Codex usage and vice versa. In order to keep track, the only way that I can figure out for Codex at this time is status. So you type /status and you see a 5-hour limit and a weekly limit. This is the same as Claude's. When we say it is a rolling window, that means after the 5-hour period it will reset and your usage comes back, and you have a weekly rolling limit as well. For a detailed breakdown, we can find information in the

platform.openai.com/usage. And so here is that example where we can see usage. We can even see across users. I have a company account, right? So we can see that information, but your personal accounts are going to have something very similar. One thing that Codex does not have, which Claude does have, is this extra usage thing. So this screenshot is from Claude. Okay. And the idea is you turn it on, you say I want to be able to go over by a certain amount, and that's how it works. But the thing is that with Codex, if you run out of your ChatGPT subscription, there's no easy

interface. You literally have to go buy API credits and then add your API key. So you can do that, but it's just not a seamless transition over. Okay, so that's the only key difference there, but there you go.

Okay, we are talking about context window size. This is the total working memory a model can hold at once. It includes both its inputs and outputs and is measured in tokens. So here I pulled GPT 5.3. I think 5.4 is out right now. There might even be newer models, but this is the one I'm using as a reference. And notice it has a 400,000 token context window, right? And so that is its total working memory,

inputs and outputs. What's interesting is they'll show you here the max output tokens. And so this is where it's going to stop. That doesn't necessarily mean that output will always be this long. It's just that whatever you're using, it's going to cap out here, and it's going to let you have more input than output. We do know that once you go over a 200,000 token context window in Claude, which is the competitor, we start to see bad results. We see results like it's just not listening, or it's hallucinating, and just quality control issues. And so I don't think the GPT models are any different. With a larger context, you

can run into issues, and so maybe Codex is trying to manage those issues by capping the output tokens at 128,000. That kind of makes sense when you look at it like that, where it's controlling that output, whereas Claude not so much. It doesn't control that. It's up to you, so you get finer control. But anyway, the other thing I need to point out is about conversations. Imagine you started a conversation with a larger context window. Imagine there is a model that has a 1 million token context window, and you were to switch to a smaller one, 400,000. You have to consider that you would lose that data, right? So I'm not sure if

we'll run into this edge case here with Codex. I don't usually try to run the 1 million token context window model, so it's less of an issue for me, but I just wanted to point that out. There are two ways we can manage our context window: we have clear and compact. Okay, so clear is going to clear the conversation, right? So you start from zero. Compact is going to manually trigger conversation history summarization. Now, it also happens automatically. So when you start to hit the limit of your context window, it's going to summarize your conversation so you can keep on going. That's why it feels like you can go forever with ChatGPT, because it's constantly summarizing.
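
As a rough worked example of the numbers above, here is a small sketch. It assumes, purely for illustration, that the full output cap is reserved out of the total window, so whatever is left is the room for input. That rule and both function names are assumptions, not documented Codex behaviour.

```python
CONTEXT_WINDOW = 400_000  # total tokens quoted above for the reference model
MAX_OUTPUT = 128_000      # output cap quoted above


def input_budget(context_window, max_output):
    """Tokens left for input if the whole output cap is reserved (assumption)."""
    return context_window - max_output


def tokens_lost_on_switch(used_tokens, new_window):
    """How many tokens of an existing conversation no longer fit in a smaller window."""
    return max(0, used_tokens - new_window)


print(input_budget(CONTEXT_WINDOW, MAX_OUTPUT))           # 272000
print(tokens_lost_on_switch(650_000, CONTEXT_WINDOW))     # 250000
```

The second call illustrates the edge case mentioned above: a conversation that grew to a hypothetical 650,000 tokens on a 1 million token model would overflow a 400,000 token window by 250,000 tokens.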

You're just not seeing it do that. But sometimes when it summarizes, it loses key information. And so that's also the reason why it doesn't remember everything: it has to do with how it selectively pulls information in. So if you have a lot of information, it needs to live in an external knowledge base that it can pull from. Okay. But anyway, we have these two commands. So yeah, Codex auto-manages context. It doesn't hit hard limits like Claude. In Claude Code, you actually have a command called /context.
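
To make the idea of compaction concrete, here is a toy sketch. It is not how Codex implements summarization. The word-count token estimate, the `keep_recent` parameter, and the placeholder summary string are all made up for illustration. It only shows the shape of the trade-off: older messages collapse into one summary, which is where key details can get lost.

```python
def estimate_tokens(text):
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def compact(messages, limit, keep_recent=2):
    """Collapse older messages into a single summary once the limit is exceeded."""
    total = sum(estimate_tokens(m) for m in messages)
    if total <= limit or len(messages) <= keep_recent:
        return list(messages)  # still fits, nothing to do
    older = messages[:-keep_recent]
    recent = messages[-keep_recent:]
    # A real agent would ask the model to write this summary; here it is a placeholder.
    summary = f"[summary of {len(older)} earlier messages]"
    return [summary] + recent


history = ["create a game", "here is the script", "add colors", "done, colors added"]
print(compact(history, limit=5))
```

Clearing, by contrast, would simply be starting again from an empty list.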

It'll show you as it's filling up. You will never see this in Codex. It'll feel infinite until it degrades. And so you just have to have a sense of degradation and when to call these.

Okay. So, we want to be able to manage our context window size. What I'm going to do here is go into our hello world and type in codex. I've got to go log back in to ChatGPT as I am currently logged out. And also, if it's been a while, you might have to just reauthenticate. It's not going to stay there forever. It's for your safety. And I'm just authenticating here off screen. Not on screen, obviously. And we'll go ahead and hit

continue. Get ourselves in there nice and comfortable. And we'll go ahead and hit enter. Okay, because now we're ready. And so, I want to see if there's a way for us to see our context size. Is there any way? I don't even think there is. So, if we hit this tab here, and I'm just looking here: resume, fork, plan, compact. Can we see that? There's a status line. I wonder if we could update the status line to find out. But I'm going to go ahead here and I'm going to say, can you create me a tic-tac-toe game? You know what I'm going to

do? I'm going to just back out for a second. We'll make a new project, tic-tac-toe, and cd into tic-tac-toe. And I'm going to reopen this here. And I'm going to say, create me a tic-tac-toe game as a bash script. Okay. And so we'll go ahead and run that, and that's going to go ahead and create it. Now, I didn't do code . so I can't exactly see what I'm doing. I should really back out one directory so we can see everything. I'll do that in a moment once this is done generating.

Yes, you are. It knows that it's empty, so it can do whatever it wants. And so, we are just waiting for it to figure out what it's doing. While that's going on, I'm just going to open a new tab here. Oh, no. Let's try this again. Oh, it didn't let me do that. That's fine. We'll wait a moment. And so, here is our code. And now it's running it to make sure that it works. It's telling us that we can run it. I'm just going to quit out of here, back out of the directory, and do code . on my top level repo here just so I have an easier time navigating. Now, it's a non-issue early on here, but as our repo grows, this could be a problem. At least

with Claude, it would like to go up a directory and search stuff that I don't expect it to look at. But anyway, so here we have our tic-tac-toe. We're going to go ahead and run it. Oh, but you know, the thing that happened is that we left that conversation. So now we don't know what the context is. So, what I'm going to do is type in codex. If we ever want to get a conversation back, we can just go ahead and type in resume. Okay. And if we do that and pick the one from one minute ago, it'll bring that back into context. So, I have a question, like, can Codex tell me

how large my context window size is? One thing I think it probably could do is maybe update the status line and tell us. The status line is the thing down below here. I wonder if it could be updated to show us that. I think that would be interesting to see. So here it says, how would you even know? The local environment doesn't expose the context size directly through the environment when running Codex locally. So the practical usable window size is this. Okay. But can we ever see the size? Is there any data we can expose to the status line to see our current working context window size?

Okay. And this is where it's kind of frustrating, because if you can't track it, it's a bit hard. Now, if we were programmatically working with it via their SDK, I'm sure we could probably get that information, but through this interface, we might not be able to. And so, we're just waiting to see what it says. What's interesting is it is searching some .codex file, but there's nothing in that. So, I'm actually a bit confused. I thought this was just an empty file, but maybe it's not. So here it

says your local Codex config indicates the active model, and that it's checking whether Codex exposes a live token count. So here it's suggesting this is an SQLite database. Okay. There's no obvious status information exposing that. So no, it's not going to tell us. That's interesting. One thing I'd like to do while I'm here: I do have SQLite installed. If you don't, well, I mean, you'd have to, because that's how it's able to check it, but I want to go here and just open it up for a second. Can I split this here? There we go. I'm just curious, if I go into tic-tac-toe here: sqlite3 .codex.

I'll just take a look at some of the queries here. Show tables. SQLite is not something I remember the queries for. So I'll ask: can you create me a cheat sheet Markdown file for SQLite commands I can run against my .codex file? Okay. And actually, another thing that we could probably do: there might be an extension for SQLite to view it. SQLite 3.

I mean, this one's more popular. We don't need to edit any data. We just need to view it. So I'm going to go ahead here and try this now. Did that work? Go back over to here. Sometimes when that happens, you have to refresh. Do you want me to allow writing a Markdown cheat sheet file in this workspace? Yes, of course. I'm just trying to get a cheat sheet as well so that we can test it. There we go. And so in here we might have some examples. It seems like there's more stuff in here than just what we're expecting, as it's going

to very specific things. I'm just confused, because what is this? Is this a binary? I'm going to make the next video about what it is, because I'm very curious. I thought it was just an empty file, the configuration file. But anyway, I guess it must not be. Maybe this is an archive. But the point is that we want to focus on context size, right? And we have no idea what our context size is. So all we can do is type in compact. Okay. And that will summarize our working conversation. So, it really is a touch and feel kind of thing.

I want to stay on point here with compact and clear. Okay. But anyway, so we're compacting the conversation. No visibility whatsoever. I don't like that. And we'll just wait till it comes back and reports to us. I was hoping with the database that maybe we could open it up and see if we could find the information ourselves, but there's obviously more to that file. So, now it's compacted. If we want to clear it out, we type in clear. That's basically like starting a new conversation. Here it says token usage, total, input. So it's literally showing us the number,

but it didn't show us when we compacted it. Weird. So there is information. How did we get that active information? I have no idea. But hey, at least we tried. Okay, so that's your compact. That is your clear.

All right. So, I did a bit more research into this .codex file as it was really confusing me. I mean, it says this here, but with more research I figured out what it is. It is just an empty file, and it's just there before we have the folder, because there is a folder called .codex. What was really throwing me off was that there's this empty file, and I thought maybe it wasn't a file, it was a binary and it was

searching that SQLite stuff. Maybe there are SQLite files down the road here, as we will find out. But for now, I just wanted to clear up that confusion. This is really strange. I don't know why they would do that. Like, why make a file in the place of a folder that's not there? Maybe it helps Codex load that area. I don't know. But anyway, that's what it does. Strange, right? Ciao ciao.

Okay, let's talk about Codex sessions.

This is functionally the same as a conversation with your Codex agent. So when you start talking to it, that is a session. When you type the word codex, you are starting a new session. Okay, every time you do that. You can also use /new to start a new session whenever you want. And I need to make it clear that when we say sessions, we can also mean threads, messages, or conversations. These are interchangeable terms. In the database, they call them a thread. In some interfaces it calls it a session. In other places it calls it a conversation. And at the very low level, they really are messages. So just understand I'm going to keep mixing those terms so you get used to it. But we're

all talking about sessions. That's going to be the top level word I want to use because other providers use that as their anchor word, and I want to make it consistent so that you can move between Codex and other providers. So obviously we can use the new command and talk to it, and a new session is created every time we do that. We can also fork a session. That's where you can resume a previous session without affecting the original session. And so we have the /fork command. It's different from Claude Code, because Claude Code will allow you to choose any point in time, similar to Git forking, but this is just copying it and we're continuing on the conversation from

there. There's no point-in-time history. And I say you don't specify a session ID, and yeah, you don't for the fork command, right? But there is a session ID for sessions, and we'll talk about that here in just a moment. Okay. So we also have resume and rename. We've already probably encountered resume. Resume allows you to resume a previous conversation. And so if you made a new conversation and you were to quit, it would actually tell you what session ID you are using, so you can run this command again to re-enter the conversation. If

you resume a previous conversation and quit, for some reason it doesn't spit that out, which is kind of annoying, but we can get to the session ID. I'll talk about where it actually is stored. Rename basically gives the conversation an alias. So now you can do codex resume with whatever name you gave it. For some reason, at this time it does not show up in the resume history. I'm not sure why they didn't do that, but it's not there. So, you have to kind of remember what it is. Or if you really want to get to it, there is an SQLite file in your .codex directory in the home directory, and there's a table called threads, right? And if you select ID and title, there's the ID, there is that

title, and it does not store the rename in there. The rename is stored, for some reason, in a separate sessions index JSONL file, and there it is. So yeah, I mean, it kind of feels like Codex was coded using Codex, because these structures are very illogical to me, but that's where they are, and if we ever need to get that information, we can find it that way. Okay.

Hey folks, it's Andrew, and we are going to take a look at that new command and that fork command. So right now I'm in that tic-tac-toe. Just going to quit out here so we are at the same level. I'm

going to go ahead and type in codex, right? And we can go ahead and type in resume to go to a previous conversation. So here is the one from a while ago where we created the tic-tac-toe game. So I'm going to hit enter. And so if I want to fork this conversation, I can go ahead and do that. And now it's forked. And I can say, can you add colors to the bash game, right? Because right now I don't believe that there are any colors. So I'm just asking for a small graphical improvement, and it should be able to figure something out. We'll give it a moment here to figure that out. If there's anything it needs us to accept, we'll just

accept it, but it won't take too long. Okay. And so we'll go ahead and we'll say yes, don't ask me again for these files. Okay. So, we have that. What I'm curious about is, if we go back to resume, right, we now have this one here and this one here. Is it easy to see what they are? Not necessarily, because we can't name those sessions. In Claude Code you can name your sessions, but I don't believe here that you can rename them. I'm just going to go double check rename.

Oh, we can. Okay. Rename. I wasn't sure, but we'll call this one tic-tac-toe improved. Okay. And then we'll go back to resume. Uh-huh. But it doesn't show that. It's like, great, we renamed this thread, but I don't see that reflected. I'm going to go ahead and stop this. Going to type in clear. Oh, look here. So now we can resume it based on its name. Interesting. One second. I'm going to make this a slide. Okay. All right. Something else that's really interesting is if I go here and type in codex and we'll

just say hello, right? And then I quit out. One second here. I quit out. Notice it's outputting a session ID, which is interesting, because before I thought maybe we didn't know what it is. Also notice we're getting our token usage, not while we're in the active interface, but we can see it here, which is kind of a weird way to work around that problem. Okay. Sorry, I keep pausing because I get good screenshots here and then I go back and re-record previous videos so they're better for you. And so, one thing I have in my mind is, okay, we have this one with a session

ID. We have another one with a name. They must be storing this somewhere. And so, I would think that this would go into the home directory here, like .codex. Okay. And then in here, I'm looking for sessions. For Claude, it's called projects. We'll go to sessions here, 2026, 04. I'm just trying to find where this information is. Today is probably 05. Okay. And then we have rollout files. So I'm going to just cat this stuff here and take a look. So here we do have information, but that's not really that helpful.
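
A small sketch for locating those rollout files without clicking through folders by hand. It assumes only what is visible above: a sessions directory with year, month, and day folders, containing files whose names start with "rollout". The function name `list_rollout_files` is made up, and the exact file naming should be treated as an assumption.

```python
from pathlib import Path


def list_rollout_files(sessions_dir):
    """Return every file under sessions_dir whose name starts with 'rollout'."""
    root = Path(sessions_dir)
    # rglob walks all nested year/month/day folders, whatever their depth.
    files = [p for p in root.rglob("rollout*") if p.is_file()]
    # Sorting by path puts files in date order when folders are named YYYY/MM/DD.
    return sorted(files)


# Assumed location, following the home-directory layout described above.
sessions = Path.home() / ".codex" / "sessions"
if sessions.exists():
    for path in list_rollout_files(sessions)[-5:]:
        print(path)
```

If the directory does not exist on a given machine, the script prints nothing rather than failing.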

Uh-huh. Okay. And so what I'm trying to figure out here is where this stuff is stored, as we also have these SQLite directories, and we also have logs over here as well. I'm just trying to correlate where that is, because right now I don't know where these session IDs are stored. So I'm going to go back over to here and we're just going to ask Codex: where is Codex storing the session IDs or names for conversation threads?

This must be in .codex. Is it in an SQL database? And the reason why I want to know this is that, at least as far as I'm aware, there's no command that's going to list out all the session IDs and the session names. And so we can't get it that way. I want to know explicitly where it is. I have a feeling that it's in that folder, in the SQLite directory. Notice here that it's exploring locally, but there we go. That's where I want it to go. And so it's going into here. It's checking the logs. Okay. And it is querying. So it's selecting the name and SQL, checking what all the names of the tables are here. So there's a JSONL index for names and an SQLite threads table for broad thread metadata.

So it's using both plain files and SQLite. Yeah. We know that. And so here it says the conversation thread metadata is here. Let's go take a look at this. So I'm just going to cat out that file. I mean, that is the thread name. So that at least tells us that we can find the thread name in the sessions index.jsonl. That's interesting. This one has a threads table for per-thread metadata. And then the other place, I guess, must be the state file. Okay. So the short answer is yes, there's an SQL database, but the thread names are stored in a plain JSON index file. All the actual conversations are stored in sessions, not only in SQLite.
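
Since the names live in a JSONL file, one JSON object per line, they can be read with a few lines of code. This is a sketch under stated assumptions: the field names `id` and `thread_name` are hypothetical and should be checked against a real line from the file, and `load_thread_names` is a made-up helper name.

```python
import json


def load_thread_names(lines):
    """Map session id -> name from JSONL text lines (field names are assumed)."""
    names = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # "id" and "thread_name" are hypothetical keys used for illustration.
        if "id" in record and "thread_name" in record:
            # Later lines win, so a second rename replaces the first.
            names[record["id"]] = record["thread_name"]
    return names


sample = [
    '{"id": "abc-123", "thread_name": "tic-tac-toe"}',
    '{"id": "abc-123", "thread_name": "tic-tac-toe improved"}',
]
print(load_thread_names(sample))
```

To run it against the real file, pass an open file object instead of the sample list, since iterating a file yields its lines.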

Okay. And so, we can go back to our cheat sheet here, because I wasn't expecting us to do this right away, but maybe we can. And so here, yeah, see here, this is actually the perfect command. This is actually what I want. So, I'm going to go down to here. I already have sqlite3 installed. You might have to install it, you know, sudo apt install, whatever, or just look up how to install sqlite3 on Linux, though I'm pretty sure it comes pre-installed on WSL with Ubuntu. And so here what I'm trying to do is see the contents. And we're selecting everything, which is not very helpful. So I'm going to go here.

Maybe there's one called ID. Ah, there we go. Okay. I wish it would just describe the table to me. We have list tables, show schema. This is actually what I need. So I'm going to grab this one here. And so we have an ID here, source, model, title. So I have ID and title. So I'm going to go back, hit up, and we'll select ID and title. Okay. And so there it is. So the idea is that we have an ID and a title, and then the name in that other file. Very strange system, but I mean, if we wanted to resume one of these conversations this way, that's how we are going to do

it, right? But anyway, we are a little bit off track here. No, maybe we're not. Let me just check. I could have sworn we had other commands that I wanted to run here. I think we did the new command, right? And we did the fork command, we did the resume, and we did the rename. And now we know where the stuff is coming from. So, no, we covered everything. So, that's perfect. But yeah, and you're going to have already seen this, I'm going to retroactively add this information so we explicitly know, and we will figure it out from there.
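
The ID and title lookup can also be scripted. The sketch below relies only on what was shown above: a table called threads with id and title columns. Because the real database file name is not given here, the demo builds a throwaway in-memory database with that shape. `demo_connection` and its sample rows are made up for illustration.

```python
import sqlite3


def list_threads(conn):
    """Return (id, title) pairs from the threads table."""
    return conn.execute("SELECT id, title FROM threads ORDER BY id").fetchall()


def demo_connection():
    """In-memory stand-in with the same two columns (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE threads (id TEXT PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO threads (id, title) VALUES (?, ?)",
        [("thread-1", "create a tic-tac-toe game"), ("thread-2", "hello")],
    )
    return conn


for thread_id, title in list_threads(demo_connection()):
    print(thread_id, title)
```

Against a real install, `sqlite3.connect()` would be pointed at the SQLite file inside the .codex directory instead of `:memory:`. Opening a copy of the file is a cautious choice, since only viewing is needed.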

Okay. Ciao ciao.

So I want to revisit the idea of viewing token usage. I think I said in the context window size video that we don't have that visibility within Codex. I was wrong, as I had to do more digging to find it, and it was too hard for me to update the prior video. So I'm just correcting it now. I noticed that when you quit out of a session, it would actually output token usage, say total, input, and then cached and output, right? And so I'm like, okay, where is that coming from? And I found out that it comes from these rollout files. You have to cross-reference them with the SQLite databases and stuff like that. And so I've created a Bash

script in the follow-along that follows this video. I already have a script. I'm just going to work on it a bit. And so you don't need to do anything. Just watch the video and see how it works, and then run it on your own machine so you can get output, and then we can accurately track that usage. Okay.

Hey folks, it's Andrew, and in this video I want to try out this usage script. Now, I just asked Codex, can you make one for me, as right now it's very hard for us to see our current context window usage. I would think that would be information we could add to our status line. But what I'm going to

do here is I'm just going to cd out, and you would have to download this repo. It's under Exam Pro, over here in the Exam Pro Codex examples. I didn't even push the repo yet, but it will be pushed, and it'll be over here, right? And so let's go ahead and see if we can execute the script. The only thing that it did need that I did not have installed, which I was actually surprised by, was ripgrep, which is heavily used by Claude, so I'm really surprised it wasn't there. So we'll just put that here as a prereq. Okay. And so the way this should work is that it's going to read, well, what files is it going to read? Let's take a look here. It reads the rollout

file. So let's go take a look at how the script works. So we have the state file, where we get the state database, and then we have the session index. Okay. And if we scroll on down, oh, we have the usage and stuff there. That's nice. And so what I'm looking for is exactly where that information is. I believe it's pulling it from the rollout file, which is not very clear from that bash script.

Okay. But that's fine. I just want to see if this actually works and we can get our usage. So, I'm going to go back over to here to my readme. I'm going to go ahead and run the latest. And the idea is that it's going to look at the last thread that we had and check its usage. And so I'm going to go ahead and run that. And here it shows the thread. It has no thread name. What's the name of the thread? What model did it use? And then we have the context window down below. The only thing I don't like is it's not comma-delimited. So, I might just go ahead here and update it and just say, my script outputs the model context window

number, but it's hard to read because it's not delimited. Also, it's not showing us the total context window, e.g. used versus total. And there's also, and I don't know if we can find that information out, what was output versus what is the full context window. So that's kind of a challenge that we have here. Again, I don't know why this isn't just built into it, but, you know, Codex keeps it light, and we'll see if we can update my script here. We'll be back in just a moment.
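
The two readability asks above, thousands separators and used versus total, are easy to sketch. This is not the course's script (that one is Bash). It is an illustrative helper with made-up numbers, and `format_usage` is a hypothetical name.

```python
def format_usage(used, total):
    """Render token usage as 'used / total tokens (percent)' with comma separators."""
    percent = used / total * 100
    # The ',' format spec inserts thousands separators: 100000 -> '100,000'.
    return f"{used:,} / {total:,} tokens ({percent:.1f}%)"


print(format_usage(100_000, 400_000))  # 100,000 / 400,000 tokens (25.0%)
```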

You won't have to do any of this. You'll just run it. Okay. But yeah, here, look at these files. So, it's pulling from the rollout files. I just want to grab that here. And I'm just going to put some information in the readme, you know, so we can determine token usage based on rollout files. Session rollout files. We'll have to figure out what those are. I'll probably make a slide. You'll probably have already seen it before we looked at this. And now it's just running the script itself.

Okay. So, I'm going to go back over to here. It says it's good now. And I want to go down here and run this. Oh, I'm not in the right directory. I'm going to just quit out here and run it here. And so now we have something way more readable. I'm not sure why it's saying 258, but that's what it's coming up with. And so at the very least, we're kind of getting some sense of usage, but you can tell it's not as easy. It'd be really nice if we could replicate that context command that Claude has, because then we could keep track and be like, okay,

our usage is not the best, right? Or whatever. But now that we have the script, I'd probably try to roll it into our status line, and maybe we will try doing that, right? Because here we could pass in exactly the current context and stuff like that and limit the information. But for now, I would say we're starting to get some information. And it wasn't in this video, but we did observe the token usage. I think it's when we quit, right? So we would just say hi, like this, and then we quit, and we do get this information. And so it would be nice to correlate and see if that information is the same. I guess we

could just run the last one, right? And so here we see total 2,950. And so that's why it's confusing. So I can go back over to here and copy this. I'm not going to save it, but I just want to paste this in here like this. And ask, why is there a discrepancy? Right. So: I noticed Codex, when quitting, reported much lower usage than our report script for usage did. What's going on here? And I'm hoping it goes into debug mode. And so I can just paste it in. And so now we have that information. I don't think it pasted right. We'll try that again. And I want to go back over to here.

Oops. I'm going to just copy this whole block here. Our usage numbers for our report don't match what Codex reports when quitting. See if we can paste that in now. There we go. And so now I'm giving it more information. I think that's the right information, right? I'm going to go over here. I just don't trust it. Yeah. No, no, we gave it. Okay. And so it's off to the races. It says: I've confirmed this is a tiny repo with just the report scripts, so the mismatch is likely in the parsing logic rather than the surrounding code. I'm reading the script and the

reference JSONL again. The raw rule already explains this best: the script is reading input tokens, which includes cached input, while the Codex quit summary separates input from cached input. I'm updating the report to display the split instead of collapsing everything. Oh, that sounds better. Okay. And again, we might be able to use the script in our status line, assuming that information is not already available in our status line. I had already asked in a separate video if I could get it, and it was like, no, I can't do that, which I was surprised by. But again, there's not much in the docs, and I wouldn't know how to go about inspecting. I guess we could just ask Codex, like, hey, tell

me what data gets returned back in the status line, because maybe it's already there, but we are just looking at a way to fully understand it. So here it says it now produces this information. Okay, and we'll just wait for it to update the script. All right, and so it thinks that it has repaired the script. Verify with this file here. Sure. And so I'm just going to stop it here. And so now we have 2823 with cache and output. We'll go ahead and run the latest. And here we are now getting our total usage. And now we're getting much more accurate information. One thing that I would prefer is if this script would

return back this JSON data. Now, again, this data might already be available in our status line, which I don't know, but we'll have to make that a separate video. And so, I'm going to go back to here. I know you're just watching me coding and you're just learning as we go here, but I would just say, you know: the report is great. Can you add a JSON output mode so we can get the report back as JSON? And I'm just going to make sure it doesn't already have that mode. I'm pretty sure it doesn't. The default mode will be text, which will print the current report format as is.
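The fix described above, splitting cached input out of the input total and offering a JSON output mode next to the default text mode, could look roughly like this. It's a minimal sketch, not the script from the video: the field names `input_tokens`, `cached_input_tokens`, and `output_tokens` and both function names are assumptions made for illustration.

```python
import json


def summarize_usage(usage: dict) -> dict:
    """Split a raw usage record into fresh input, cached input, and output."""
    cached = usage.get("cached_input_tokens", 0)
    # Assumption taken from the debugging session above: the raw
    # input_tokens figure already includes the cached input.
    fresh = usage["input_tokens"] - cached
    output = usage["output_tokens"]
    return {
        "input_tokens": fresh,
        "cached_input_tokens": cached,
        "output_tokens": output,
        "total_tokens": fresh + cached + output,
    }


def format_report(usage: dict, output: str = "text") -> str:
    """Render the summary as text (the default) or as JSON."""
    summary = summarize_usage(usage)
    if output == "json":
        return json.dumps(summary, indent=2)
    if output == "text":
        return (
            f"input: {summary['input_tokens']} "
            f"(+ {summary['cached_input_tokens']} cached), "
            f"output: {summary['output_tokens']}, "
            f"total: {summary['total_tokens']}"
        )
    raise ValueError(f"unknown output mode: {output}")


if __name__ == "__main__":
    # Made-up numbers, only to show the two modes.
    sample = {"input_tokens": 1000, "cached_input_tokens": 800, "output_tokens": 50}
    print(format_report(sample))
    print(format_report(sample, output="json"))
```

Keeping the summary as a plain dictionary means the text and JSON modes can never drift apart, since both are rendered from the same numbers.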

Okay. And so I'm going to update that one last time here, because again I'm thinking that maybe we can use that in the status line, but we'll have to investigate the status line separately, which is that thing down here below. Okay. All right. So it says that it's applied the update and we have our option here. So I'm going to go ahead and copy that, close it out, hit enter, and we get back information. That's not very readable. So I'm going to go ahead and do jq. jq's already installed on my machine. I thought that's all we have to do.

How do we output to jq? Or maybe it's this, and then we pipe it to jq. That's probably it. jq, like that, with a period. I don't know how I know that, but we'll go back to here. And we'll just take a look at what we have. I think it's because we aren't specifying what, and it's returning everything. And so here probably I have to do hyphen latest like that. Okay. And so now what we can see is mode latest, output JSON, limit 10 threads. And so it's just giving us the last thread here, because obviously multiples can be returned. And so now we have this information. I'm just going to go

ahead here and give it an example because sometimes what helps is that if you put um output examples in here then um it can be used later on. So output example. There we go. And so you know if we need to leverage this I'm not sure if we do but if we need to leverage this we will be able to let me save that file. What's wrong? Let me save it. Let me Oh I saved it in the wrong spot maybe. Hold on here. No, it says, "Oh, there's a newer file."

Okay, hold on here. Save it. Yeah. Overwrite. It's fine. We will go overwrite that one. The only thing that's missing here is that JSON option. So, I've just got to write it back in here. I guess what it was saying was it was already open and it just didn't refresh it. So, I lost that information. So, we have output JSON. There we go. Like that. And now we have a way to keep track of our usage. Okay, but we'll see if there's a better way.

Let us take a look at the status line. So, you can choose what appears under your prompt. Okay, so we have GPT 4.5, how much usage is left, the weekly usage, the context window, and how much we're using in and out. Okay. And we actually created a script because I wasn't sure if we could get that information. But eventually I just tried it out here and it was very obvious that you could just checkbox these things. The only thing is that I thought that this was going to work like Claude, with a bash script, and that there would be a session JSON. And so as I'm doing the lab that comes up here, I'm basically exploring and trying to understand how this thing works. Uh and

I didn't re-record it because I thought it was interesting to see where all the data sources were coming from. But anyway, the point is we can checkbox it off, and our other script we do not even need. But it's nice that we have created it and that we can work with it, because maybe there is something else we can do with our script if we ever needed to, like keep track of our usage. So, you know, we have that. Okay. Ciao ciao.

In this video, we are going to explore the status line, as that will allow us to have more information that we would like to know. One thing that's hard is finding anywhere in the documentation of

what data is returned back in the status line. So, I figured we will use Codex. And we can do this anywhere. I'm in the usage script folder, but we'll use Codex to see if we can determine what is in the status line. So Codex can update the status line, but I don't know. I'm going to just go back a second here. I'm going to make a new folder here. Call it status line. Okay. And we'll cd into that directory. And I'm just going to say: Codex lets us update the status line. I assume that it uses session JSON data, but I don't know its schema.

Is there a way you can probe and return the actual JSON data, or how it collects it, and put that in a markdown file so we can see? Right, and so I'm going to go task it with that, because I assume that's what it's doing somewhere, it's reading something, because that's how Claude works. Right. And also Claude has like a bash file that gets stored in a very particular location. There is a slash command for us to update our status line, which will just probably write a bash script. But let's see what it comes back with. So it's trying to figure it out. So: I'm tracing the status line project.

I found a local Codex state in logs, which is enough. Okay, hold on. Stop. Okay. Do you understand that I am talking about the Codex status line feature and I just happen to have a folder called status line? Because I just want to make sure it understands that context, because I might have confused it. Not this workspace folder. The folder name. Yeah. So there we go. That confused it. So now it's gonna go ahead here. Uh-huh. And so it's going to go to the web and try to find something about status line. I can do that as well, right? I can search Codex status line. And yeah, I'll just show you here. Like people go here and they're like, "Yeah,

GitHub." You know what I'm saying? Like I'm not trying to make it feel like I'm lazy here, but like literally I cannot find information about the status line, but um you know, sometimes it does a search and figures it out. Yeah, see it's like I can't find it. Still can't find it. We'll give it a moment here. Maybe it'll go in the code and take a look. Look, it's like literally looking at schema information now. Mhm. So the public config does uh docs only show this. Uhhuh. Yeah, it's not documented. So we're just waiting here and seeing what it can do.

Literally, it's going to code and checking. That's sad. They really got to update their docs. People would use this stuff more if the docs are out there. We'll give it a moment. Okay, it used a lot of consumption there just to um do that for us here. I'm going to give this a nice refresh as we do have a status line and we're just looking for notes. Okay, so the short answer, I did not find evidence of custom status line script that hooks into the thing. What I did find was this um so they have this reference here.

You know, sometimes this stuff is hard, like they have the feature but it's not easy to find. So here we are looking. We have the config.toml. It's not going to tell us anything. That's not helpful. We'll go back over to here. So, the Codex status line appears to be an internal renderer of named built-in items, not documented. Uh, run my script. So, from an install it also found this, and now we're getting some output. Hold on here. The status line appears to be assembled from multiple internal sources, not a single session JSON. Okay, so the rollout JSON contains structured metadata like this, with likely mappings to persistent thread state in SQLite.

Okay. And here we have live token rate limit events from rollout JSON. So the rollout JSON contains live token information. Uh okay. So the part that's confusing, right, is like where is the status line getting updated? So if I go here, right, and I write status line configure where items appear. What if I hit enter? We have model with reasoning. Oh, is it just options? Is that all we get? Oh, I was expecting this to be a script. So this is the problem. You know, you got to explore and find out. So here we have total context window size.

Used tokens, input tokens, output tokens. Wow, there we go. Okay, interesting. But hey, at least we can get that information there. I actually kind of like this because it's just readily available. Session ID. Um I don't care about Codex. Weekly limit. How do I save this? Enter to confirm. And so now we have more information. Now the only thing is like we can't see all of our information. So, that is a little bit silly. Like, what's the point if we can't see it all? So, I'm going to go back over to here, status line, and I'll just take current directory off because, oops, that's taking up too much room, right?

Current directory. Enter. There we go. And so now we can see our input and output. That is way better. Okay. So, I think that is what we were looking for. But yeah, I thought it would be a bash script. That's what I was expecting. I thought it was going to be JSON because that's what Claude does, but it doesn't. Still, it works. And I guess our usage script isn't necessary, but now we know where the data is coming from. We understand how it's assembled. There's not a single session JSON. It's assembled on the fly, and it's not something we can configure with a script.

Okay, but there we go. Ciao ciao.

So the AGENTS.md file is a file that sits in your repository or a few other places, and the idea is that Codex will passively read this file and that will inform it how to think about your project. And so there is an AGENTS.override.md in your global scope, that's in your home folder, notice the tilde, under ~/.codex. So it's going to be there. Then each project will have an AGENTS.md file, and you can also have ones in subdirectories, and they will be joined as it walks through your codebase. So if it encounters AGENTS.md files it's going to read them as well and append them. And there is an init

command that we can use to initialize and create our initial AGENTS.md file if you're not sure exactly what to write. A good AGENTS.md covers repo layout, important directories, how to run the project, build and test commands, engineering conventions, PR expectations, constraints, do-not rules, what done means, and how to verify work. I don't know what it means by "what done means", but this is straight from the docs, so that's what they wrote. But anyway, there are some things we need to know. The first thing is that it can only read the first 32 kilobytes and after that it will truncate. Claude is different, where it's the first 200 lines, and anything after that they don't want you to do it

and they want you to push it into import statements, which, by the way, AGENTS.md does not have an import shortcut. Gemini CLI does, Claude Code does, other ones do. For some reason Codex does not. And so if we want it to read other files, I like putting things in a docs folder, you have to say like: you must read the following files. The only disadvantage to this is like, what if it decides not to read it? Okay, and so sometimes I've had to do things like, you must read these files, and at the bottom, like, are you sure you read those files? You know. So, you know, it's not foolproof, but we do get 32 kilobytes. And as we were working through this, I wasn't aware that we did not have

this import statement. And so, uh, you'll notice that we explored through this and we find out that like this stuff's not documented. We're only finding like this information through GitHub issues of people talking about it. It's like this 32 kilobyte, not in the docs. Okay? So, I'm going back and forth here. like I'll record a lecture and then do a lab and then I'll find, oh, it's not really true and I'll go back to this and back and forth. And so that's why you'll notice that I have some inconsistencies here, but I do my best here to repair them. And I try to preserve um uh where I'm finding this stuff because that's the interesting part. Okay. But yeah, we'll continue on.
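To make the lecture's description concrete (files joined as you walk from the project root down into subdirectories, then cut off at 32 kilobytes), here is a minimal sketch of that behavior. It is an illustration of what was described, not Codex's actual implementation: the function names, the outermost-first ordering, and the blank-line separator are all assumptions, and the global file in the home folder is left out.

```python
from pathlib import Path

MAX_BYTES = 32 * 1024  # the 32 KB cap mentioned in the lecture


def collect_instruction_files(project_root: Path, cwd: Path,
                              name: str = "AGENTS.md") -> list[Path]:
    """Return instruction files from project_root down to cwd, outermost first."""
    root = project_root.resolve()
    target = cwd.resolve()
    relative = target.relative_to(root)  # raises ValueError if cwd is outside root
    found = []
    current = root
    # Visit the root first, then each directory on the way down to cwd.
    for part in ("",) + relative.parts:
        if part:
            current = current / part
        candidate = current / name
        if candidate.is_file():
            found.append(candidate)
    return found


def build_instructions(files: list[Path], max_bytes: int = MAX_BYTES) -> str:
    """Join the files in order and truncate the result to max_bytes."""
    joined = "\n\n".join(p.read_text(encoding="utf-8") for p in files)
    data = joined.encode("utf-8")[:max_bytes]
    # Drop a partial multi-byte character if the cut landed inside one.
    return data.decode("utf-8", errors="ignore")
```

The point of the sketch is the consequence: whatever comes last in the joined text is the first thing lost when the cap is hit.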

Okay. So, in this video, we want to explore the AGENTS.md file. I know we made a tic-tac-toe game. So, I'm going to go over to that one. And in here, we are going to create a new AGENTS.md. Actually, we don't have one. And so, what we'll use is the /init command to generate one. And so, what it's going to do is read that repo, and when it's done, it's going to generate one. So, there it is. And we'll give it a moment. Okay. So, we should now have an AGENTS.md. We'll take a look and see what it looks like. So, I'm just going to split over here. So, project structure, module organization, build, test, development commands, coding

style, naming conventions, testing guidelines, commit and pull request guidelines. And so we have information there. I would just now tell it: can you break up the AGENTS.md into a docs directory and then just reference the list of files in the AGENTS.md, so it knows where to look. Okay. And I mean, basically every heading in the AGENTS.md should be its own file. Name the markdown files in the docs directory in snake case. Okay. So, we'll let it go ahead and do that. And I think that will get us a little bit better result. We'll wait a moment here for it to do that.
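What that prompt asks for, one file per heading with snake case file names, is mechanical enough to sketch by hand. This is only an illustration of the request, assuming the generated file uses level-two `## ` headings; the function names are made up.

```python
import re
from pathlib import Path


def snake_case(title: str) -> str:
    """Turn a heading like 'Coding Style & Naming' into 'coding_style_naming'."""
    return "_".join(re.findall(r"[a-z0-9]+", title.lower()))


def split_by_heading(markdown: str) -> dict[str, str]:
    """Map a snake_case file name to the section under each '## ' heading."""
    sections: dict[str, list[str]] = {}
    name = None
    for line in markdown.splitlines():
        if line.startswith("## "):
            name = snake_case(line[3:]) + ".md"
            sections[name] = [line]
        elif name is not None:
            sections[name].append(line)
    return {n: "\n".join(lines).strip() + "\n" for n, lines in sections.items()}


def write_docs(markdown: str, docs_dir: Path) -> list[str]:
    """Write each section into docs_dir and return the file names created."""
    docs_dir.mkdir(parents=True, exist_ok=True)
    parts = split_by_heading(markdown)
    for name, body in parts.items():
        (docs_dir / name).write_text(body, encoding="utf-8")
    return sorted(parts)
```

Anything above the first `## ` heading, such as the document title, is deliberately left out, since it would stay in the AGENTS.md itself.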

Okay, we are back. It didn't listen to me. I said snake case. This is not snake case. This is hyphenated, which is kebab case; snake case uses underscores. But that's fine. It's still okay. It's consistent. And so, we're going to go to our AGENTS.md and I'm going to go over to here. And we have this. I feel like we don't need that: this file is the entry point for the contributor's guide, each topic below is in docs. Keep contributor docs in the docs settings. I don't need that. So,

we'll take that out. And I'm going to take this out. We'll say project guidelines. this contains all the guidelines for our project. Okay. And so the thing is that there might be something like coding style naming convention. And so there's something we might want to trigger, right? Which is um I'm going to go to the top part here. And I want to add like a new thing which is like uh always speak like a caveman. Uh eg me do good. Okay. And so now we have that. And so I'm just going to see if that will take effect that first line. And I'm going to go ahead and just say um can you tell me uh what this app uh repo does?

And so what I'm hoping is it'll read that AGENTS.md file. And I'm not sure if it picked up the AGENTS.md whatsoever, because we're in the tic-tac-toe folder, right? Did you read the tic-tac-toe? Like, I'm gonna just stop it. I'm going to try this again because maybe it wasn't aware of it. So, we'll try this one more time. What is in this repo, this folder? Me find very small folder. There we go. And so now it's listening.

Okay. And I'm going to go ahead here and I'm just going to because I want to make sure that it can read uh the stuff here. And so we have styling like we'll just say like communication standards. We probably don't even need uh the um name of like we could probably just have like MD and stuff like this because I feel like it would know that they're here. You know what I mean? Um, and that would just save us more tokens for reading. So, I'm going to go here. Okay. And this is just to help it know what there is. And that's why we also wanted them to be named exactly what they're supposed to be,

okay? Because just consider that's all consuming your context, right? So, if it's pretty straightforward, we don't need that. Okay. Because we're trying to read we're trying to write this agent MD for the agents ID. So now it should know that and so just go here docs communication standards. Okay. And I'm going to go over here and we'll make a new file in here. Communication standards.

I say, um, I'm trying to think of like a persona. Talk like a person from the Victorian period. So: when talking to the user, talk in the style of a person from the Victorian period. It'd probably be better to give it an example so it knows, but I think it can figure that out. So I'm going to just stop this. I don't know if I really need to do that, but we'll go ahead and say, can you tell me what is in this folder? I just want to make sure that it's still loading that information. It might not always pick it up. Like the thing with these files is that they're guidance, right? And if it just chooses not to read it, then we won't know.

Did you open any of these files to understand context? No, I only listed the directory contents and confirmed the guideline files from the file system. I did not open and read the context files. And so there's another thing called rules. And so maybe rules would be a better place for it, but I was hoping that it would read into them. Okay. So I ask: CLAUDE.md will follow through and inject and import referenced files. Does AGENTS.md do this? So it says: in this session, AGENTS.md told me the project guidelines are in the files, but I do not automatically import them. I receive AGENTS.md, which then

told me the guidelines. I still need to explicitly open each file. So if Claude is automatically pulling them, uh, sure. So just give me a second. Be back in a moment. Okay. And so it looks like people are reporting the same problem, because if you've used other ones like Claude or Gemini, they support referencing files through an at sign, and I didn't put that in there, but here it's saying that doesn't exist, right? And so there's one project where it says, "Oh, I made this thing that you can do this for," but I don't really like the name FAF. Um but here is it coming from 666. That's the same one here. And so they still don't have support for this

unless there are docs here that say that we can do it. Let's see here. Reference... what's the structured source... reference a single file. Use that to include files and directories without waiting for Claude to read them. So what I'm going to do is go back over to here, because we didn't put an at sign here, right? So, we'll go here and do this, this, this, this, this, this. And we'll try this again. Type in clear. Type in codex. Tell me the contents. What? Tell me what the program in this folder does. And so what I'm trying to do is get it to talk Victorian to me. So here it's explored the AGENTS.md. This repo is small.

Okay. Coming back over here: quickly include files and directories without waiting for Claude to read them. But this is talking about like maybe when you are talking to it inline, right? Which is not the same thing that we're talking about. I'm talking about referencing within a markdown file. Let me just check this completely, seeing if I'm missing something. Also, I just realized I'm a dummy because we're looking at the Claude Code documentation. Sorry, I thought when we clicked through here we were looking at Codex, but we're not. And it was clear that there was something wrong because I'm like, Codex

doesn't have good documentation whatsoever. So that should have been clear here. But yeah, it's crazy this feature does not exist. The other thing is there is Codex rules, but usually rules are things that need to be enforced. So give me a second, let me take a look. Yeah, but rules aren't the same thing. I forgot they are for sandboxes. Hm. So, you know, this is a problem because I think that we should be able to load in files dynamically. So, maybe we have to look at that FAF thing or something else. Give me just a moment. Okay. Also, now we know where it

truncates at. So, here it's saying it simply will not read anything past 32 kilobytes. Um, and so we didn't know like I already shot the um the slide. I'll have to go update it now. But now we actually know the limit. Again, this is not information in the docs. We have to go digging for this. Uh, let me go update that and I'll have to re-record the lecture. One second. So, this person is saying just say you must read the following files. Well, that's not the best, but I suppose we can give that a go and try.
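Since there is no import shortcut, the workaround is an AGENTS.md that simply lists the docs files and tells the agent to read them. Generating that index from whatever is in the docs folder keeps the list from going stale. A minimal sketch follows; the function name, the title, and the exact wording of the instruction are assumptions, and as noted above nothing guarantees the agent actually opens the files.

```python
from pathlib import Path


def build_agents_index(docs_dir: Path, title: str = "Project Guidelines") -> str:
    """Build AGENTS.md text that tells the agent to read every file in docs_dir."""
    names = sorted(p.name for p in docs_dir.glob("*.md"))
    lines = [
        f"# {title}",
        "",
        "You must read the following files before starting any task:",
        "",
    ]
    # One bullet per file, written relative to the project root.
    lines += [f"- {docs_dir.name}/{name}" for name in names]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Prints an index for a docs folder next to where the script is run.
    print(build_agents_index(Path("docs")))
```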

Well, let's just try this first, okay? Because this might not give adherence, but we will try. And so, we're just going to go here. I'm just matching what that person did. And so, we'll close that out here. We'll try that again. Tell... I can't believe I'm not even making sense here. So: what program do we have in this folder, like what is it? We'll go ahead here, and so here it should re-read the AGENTS.md. We'll give it a moment here. And now it's explored those other files because we told it to, right? Well, I guess technically it's exploring, right? So, here's: I've examined the project instructions and shall now inspect the

directory contents and the principal script to determine precisely what uh the program resides here and what purpose it serves. My good fellow, the folder contains a small terminal game. So, it is a self-contained bash script. So, there we go. So, that's the way that we did it. The question is like is this a good way to get adherence because what if it doesn't decide to read all those files but I mean that's no different than if we had this as 32 kilobytes right um the other thing is like how would we know how large the file is so I would say that I would probably want a tool that would check the size of a file um but how would we know if we don't have a

warning? And 32 kilobytes is a lot, so I'm not exactly sure what we would do for that. It wouldn't be hard to make a script to check, but how would we get a warning of that, especially if we can't update our status line for that? So, I think that's a bit unfortunate, but I guess we'll create a script. Why not? So, we'll go ahead here and we'll say new folder, and this will be agents size script. And I'll go back here, cd agents size script. And I'm gonna go here and say: create me a bash script that will read the AGENTS.md and all of its markdown files in docs in a specific folder.

Okay, we are trying to determine if it's 32 kilobytes or smaller. So output the size and the size right the size the total and if we are in bounds um and return JSON okay so we'll go ahead and do that and then we'll test that because I would like to structure it this way I feel like that is a cleaner way to do it maybe it's not good in the long term but you know we'll work with it okay all right so now we have a script and here we and check it against this way. So, I'm just going to go ahead and if it doesn't uh have a read me here, go ahead and do this.
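The script in the video is a bash script that Codex generated; the same check is easy to sketch in Python. This is a stand-in, not that script: the function name and JSON keys are made up, and counting the docs files toward the same 32 KB budget as the AGENTS.md is an assumption that simply mirrors the prompt.

```python
import json
from pathlib import Path

LIMIT_BYTES = 32 * 1024  # the 32 KB limit discussed above


def measure(project_dir: Path, docs_name: str = "docs",
            limit: int = LIMIT_BYTES) -> dict:
    """Report the sizes of AGENTS.md and docs/*.md and compare against limit."""
    paths = []
    agents = project_dir / "AGENTS.md"
    if agents.is_file():
        paths.append(agents)
    docs = project_dir / docs_name
    if docs.is_dir():
        paths += sorted(docs.glob("*.md"))
    files = [
        {"path": p.relative_to(project_dir).as_posix(), "bytes": p.stat().st_size}
        for p in paths
    ]
    total = sum(f["bytes"] for f in files)
    return {
        "limit_bytes": limit,
        "file_count": len(files),
        "files": files,
        "total_bytes": total,
        "within_limit": total <= limit,
    }


if __name__ == "__main__":
    # Checks the folder the script is run from and prints the report as JSON.
    print(json.dumps(measure(Path(".")), indent=2))
```

Printing with `indent=2` gives readable output directly, so there is no need to pipe the result through another tool just to look at it.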

Okay. Um, and so I mean, it'd be better to run it the other way where it's like not from this folder, but that's fine. We'll do it this way. And so, that comes back with data. It's not really easy for me to read. So, I'm going to go ahead and say jq. Okay. Um and so here we have it limit bytes limit file counts it's counting all the files and the sizes of them. So total bytes is 2894 total K limit is 283 within the limit. I mean it's not very easy for us to read but the point is that we're very far from it and I think 32 kilobytes would take forever to get to. So probably we don't really have to worry about it um in the short term, but

it's nice to have the script in case we want it for later. So say okay. And I'll see you in the next one. Okay. Ciao ciao.

Let's talk about plan mode. And yes, I see those dots. I don't know where they came from. I can't get rid of them, but they're right here, right? So, those are the two mystery dots. But anyway, plan mode lets Codex gather context, ask clarifying questions, and build a stronger plan before implementation. So, the idea here, sorry, this is bothering me. I got to give it a smiley face. But anyway, you use the plan mode

and so here I'm saying to create a Wolfenstein 3D clone game. And what it'll do is actually ask you a series of questions, which to me is really interesting, as when you use Claude Code it doesn't ask you questions, where this one really tries to guide you along the entire process, and then at the end it will create the plan and then it can execute that plan. And actually I was very curious to see if it could do Wolfenstein 3D, but that's a lot of work, so maybe I need to keep my scope a bit smaller. We can still try to do Wolfenstein as an example, but let's try plan mode. Okay.

All right. Let's take a look at plan mode. And I figured it'd be fun to try to make a Wolfenstein clone. So, I'm going to make a new folder here. Uh, as soon as my uh terminal decides to be responsive. I'm not sure why it's not responsive right now. There we go. Type in clear. And we'll say Wolfenstein 3D clone. Okay. And obviously it's not going to be as good as the real Wolfenstein, but we'll try it. So I'm going to hit shift tab. And now we're in plan mode. Notice we have two modes, default and shift tab. And hit it again. There we go. Plan mode. So create a Wolfenstein 3D clone. And perfect because I just uh went over my usage or rolling window.

So, we're back at 100%. So, that was at 60%. So, that's good. And so my expectation is it's going to ask me questions, right? So, here it is thinking. Okay. And so, I'm waiting for it to prompt questions. So, web browser, of course. Yeah. Playable core, I would say. So, yes. Original-inspired custom, so avoid direct reuse of copyrighted assets. I mean, I don't care if it uses it. Say minimal. Minimal art for now. Programmer art. I've never heard that term before. Programmer art. That's great. That just means bad art. And we'll use, I mean, I do like Phaser. I knew Phaser pretty well. And

Three.js is pretty good as well. TypeScript is a good choice. Yep. TypeScript. But I've never seen it able to use canvas or things like that to produce it. I do feel that it probably would require like WebGL, right? And so it hasn't asked me about that yet, because it said it would make it with vanilla JavaScript. But if we do have the option for WebGL, I'll probably choose it. We'll say no, stay in plan mode. So, you know, go here and ask: is it using WebGL or canvas? Oh, it didn't ask me. So the

first time I did this it would ask me exactly what it was, but I think, sorry, yeah, I think WebGL would be better, as canvas may be limited. Okay. So, we'll go ahead and ask it to do that. Now, the other thing is like, does it know that it's going to follow the actual way that Wolfenstein 3D works? Probably not, but we will let it go off to the races here and see what it comes back with. It is a lot better. And so, we have true WebGL. So: use textured sprites in a real 3D scene for better scalability. Keep the classic raycasting gameplay model, but use

WebGL. And that's what I want, that raycaster model. And so that's what I wasn't sure if it was going to figure out. And so here we have our plan. So: build a browser-based first-person Wolfenstein game. And I want it to implement this plan. The only thing is that it's not like it's going to store the file anywhere. And so generally what I would probably want to do is save the plan, right? And tell it to write it into the Wolfenstein 3D folder, but I'm just going to let it rip. Okay. And I just want to see what will happen with such a large plan. And we're at 98%. So we've got a lot of room for this to go. And so let's go off to the races. What I'm going to

do is just I'm going to copy this because I don't want to lose it. I'm not sure if plans get saved anywhere, right? And so I just want to save this if I can. The only problem is like I can't do control C or it will kill the script. So I'm trying to copy this. I can't. So I'll have to just let it run here, which is kind of frustrating. But we will pause here and if it if anything pops up here and asks us to do anything. But it does have like checkboxes. So clearly there is a plan file being managed over here. So I'm curious if we go here. How does it know that it's managing a plan? That's what I don't know. And so while it's going here, because I don't remember there being

much in the docs, and again, this is the big problem, just the lack of documentation. And I'm just trying, offscreen here, to get another tab open. Chrome seems to be slower these days. So go over to ChatGPT: How does plan mode in Codex remember what tasks it has to do? And is this saved anywhere? And I know this isn't in the docs because I specifically looked for this. I know that Claude will save things in a plans folder. Here it says it doesn't remember tasks.

Some workflows explicitly store it. Uh-huh. Stuff is in the sessions data. It's very useless what it's telling me. Very useless information. And so probably what I would have preferred to do is say, instead of creating this plan, write it to a file, make a task, because like how's it going to keep track of this stuff? But it seems to know that it has an internal checklist. But where that's stored, I don't know. We could go into our sessions here. 04. I'm just like trying to check this while it's running. Okay.
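When a sessions folder is nested by date, picking out the newest log by eye gets awkward. A small helper can find the most recently modified file instead. This is a sketch under assumptions: the `.jsonl` extension matches the JSONL logs mentioned earlier, but the directory you pass in is whatever sessions folder exists on your machine; no particular path is implied.

```python
from pathlib import Path
from typing import Optional


def latest_session_file(sessions_dir: Path,
                        pattern: str = "*.jsonl") -> Optional[Path]:
    """Return the most recently modified matching file under sessions_dir."""
    # rglob searches every nested subdirectory, such as per-month folders.
    candidates = [p for p in sessions_dir.rglob(pattern) if p.is_file()]
    # default=None covers the case where nothing matches.
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

Sorting by modification time rather than by file name avoids depending on how the log files happen to be named.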

Oh, I don't even know what the latest one is. Well, how do we know? It's this last one right here. And so I'm just going to go ahead and just say code, enter, because I'm just curious what the contents of this file are. And so what I'm looking for in here is whether it was maintaining some kind of list. Again, this is still working. This is a big task. We gave it a really large job to do. And so another thing I want to check is maybe the SQLite file, because I want to see how it's managing that information. Again, maybe it's just

handling it in context. And so we made a cheat sheet somewhere here. I think it's in our hello world. No, it's in our tic-tac-toe. Here we go. And here I'd like to see what tables there are. So I would say sqlite, state five, and we can just enter it actually here and then we can... Nope, it didn't do that. Oh, maybe because I forgot the three on it, sqlite3. There we go. And so now I can just type commands like .tables. And so here we have agent jobs. So I might do, uh, how do we check the schema again?

.schema. So .schema jobs, right? Jobs. I would think those are background jobs, right? And then there could be like agent jobs here. Let's take a look here. Agent jobs. Okay. And again, this is still running in the background. Oh, hold on. Do you want us to allow installing all the dependencies? Yes. Yes, don't ask me again. You can go ahead and do that. And so see, like, select all from agent jobs. Let's try this.

There's no data in there. Select all from jobs. All we're trying to do is see whether it's queuing up any information anywhere. And I don't think so. Agent job items. Okay. And we'll go back over to here. And so I guess we just have to keep an eye on it, and when it shows something, do something. Here we're already at 88%. So we're using quite a bit. I'm going to pause here, uh, just so you're not watching forever. Okay, we are back and it literally checked off a list.
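The SQLite poking above (listing tables, checking a schema, seeing whether any rows are queued) can also be scripted. This is only a minimal sketch using Python's built-in sqlite3 module. The table names and columns in the demo database are made up for illustration; they are not the real layout of Codex's state file, which the video never shows in full.

```python
import sqlite3


def summarize_tables(conn):
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    # Table names come from sqlite_master itself, so quoting them is enough here.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


def table_schema(conn, table):
    """Return the CREATE TABLE statement for one table, or None if it is missing."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    return row[0] if row else None


def demo_connection():
    """Build an in-memory database with hypothetical tables, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE agent_jobs (id INTEGER PRIMARY KEY, status TEXT)")
    conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT)")
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(summarize_tables(conn))
    print(table_schema(conn, "agent_jobs"))
```

To point it at a real file, you would swap demo_connection() for sqlite3.connect() with the path to whatever state file you are inspecting. Empty counts everywhere would match what we saw in the lab.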

Couldn't find that state information anywhere. Um, and it says it's implemented something. So I'm curious. Um, and we will go and make a new tab here. I'm going to cd into our Wolfenstein directory. I'll do an npm install. One thing I am going to do first, because I'm not sure if it, um, added that file. Yeah, there's no .gitignore here. So, .gitignore, because I really don't want, um, the node packages, the node_modules, otherwise that's going to add way too

many files for us to manage. Okay. And I'm going to do npm install, npm run dev. And so that's open. We'll open that in our browser. Look at that. Wow. Okay. And the reason why this is a wow for me: this did not work last year. This is incredible for me, because I swear to you it could not handle the logic for this.

Okay. And I'm able to fire shots. This is so cool. Uh I'm definitely going to make a Wolfenstein game that nobody needs. Okay. So Oh, there we go. It's defeated. Great. So yeah, we can now make Wolfenstein 3D. I'm curious like what the code looks like. Uh you might not care about the code, but like here, yeah, we got WebGL. Yeah, it looks fine.

Okay, so it's literally rendering the vertices. Yeah, that's fine. Yeah, it uses ray casting. So there we go. We now have a Wolfenstein clone. We know how plan works. Um, and I'll see you in the next one. Okay, ciao ciao.

So a sandbox is a security mechanism for separating running programs from your OS. A sandbox provides a tightly controlled set of resources for guest programs to run in, such as storage and memory scratch space. Network access, the ability to inspect the host system, and reading from input devices are usually disallowed or heavily restricted. Codex specifically relies on sandbox libraries. This is the same as how Claude works. And so there's one called

Bubblewrap and then there's one called Seatbelt. You'll notice there isn't one for Windows. I don't know, they just don't list it. Maybe Windows already has a sandbox built into it. But generally if you're going to use Windows, you should really be using WSL 2. Um, because really the sandbox is about controlling bash commands, and that's running Linux. So generally that's why I think these things are focused on, um, Linux or Linux-like environments. Okay. Uh, but anyway,

for Codex we have the sandbox, which defines the technical boundaries, and an approval policy, which decides when Codex must stop and ask if it can do something. So the idea here is that we have a set of sandbox options. There are some things in here that are not necessarily sandbox, but they are definitely related, so you probably would want to know about them. The first are your approval policies. So we have untrusted, on-request, and never. So, um, there's that. Then we actually have our sandbox mode. So we have read-only, workspace-write, and danger-full-access. So workspace-write basically means it can, more or less, edit most of the time, but it's going to rely on the approval

policy to determine, you know, when it's allowed to do things. Um, you can set up very sandbox-specific, um, settings. So you have workspace-write, but you might say, "Oh, I don't want that to have network access, or not allow a login shell, or disable web search," right? And those are things that are associated with a, um, sandbox environment which you might want to lock down. You can also have OS-specific settings. So you might say, well, for Windows, we'll have sandbox elevated. I could not find where this option goes, the, uh, sandbox elevated. It is in the documentation, but they do not tell me anywhere where it is. And I'm noticing inconsistencies with the docs versus

what I'm doing. So, understand that I'm not trying to mislead you. I literally cannot find the backing information, and Codex cannot find it, but I leave it in here because it might exist and you might come across it later. Um, Codex recommends running in either workspace-write with ask for approval on-request, or read-only. So, there are those two options, probably the first one if you need to. Um, so just consider that if you're working in WSL, you're basically working in a virtual environment. So you do have some safeguards already in place, but you may want to run a sandbox as well.
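To keep the two knobs straight, here is a small sketch that checks a pair of settings against the values named in this lecture. It is a toy validator, not anything Codex ships. The function names are made up, and the hyphenated spellings of the values are an assumption based on how they are read out in the video.

```python
# Values as named in the lecture; the hyphenated spellings are assumed.
SANDBOX_MODES = ("read-only", "workspace-write", "danger-full-access")
APPROVAL_POLICIES = ("untrusted", "on-request", "never")


def check_settings(sandbox_mode, approval_policy):
    """Return a list of human-readable problems; an empty list means both values are known."""
    problems = []
    if sandbox_mode not in SANDBOX_MODES:
        problems.append(f"unknown sandbox mode: {sandbox_mode!r}")
    if approval_policy not in APPROVAL_POLICIES:
        problems.append(f"unknown approval policy: {approval_policy!r}")
    return problems


def is_recommended(sandbox_mode, approval_policy):
    """True for the two setups suggested above: workspace-write with on-request, or read-only."""
    if check_settings(sandbox_mode, approval_policy):
        return False
    if sandbox_mode == "read-only":
        return True
    return sandbox_mode == "workspace-write" and approval_policy == "on-request"


if __name__ == "__main__":
    print(check_settings("readon", "on-request"))
    print(is_recommended("workspace-write", "on-request"))
```

The point of the split is that the sandbox mode is the technical boundary, while the approval policy only decides when Codex stops to ask.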

There's also this option to check, um, you know, commands and see if they work. So here they go sandbox macOS and you give it the command, and it either will work or won't, and we can log the denials. There's no documentation on how to use this, and I don't really have a use case to test it, so it's not really a big deal for me. You can also add, uh, very specific read directories that it's allowed to read from. Um, but yeah, there you go. That's sandboxes.

Hey folks, it's Andrew, and in this video I want to take a look at, um, sandbox features. And so the idea here is that we should be able to, um... Whoops. If I type here, just a moment. Type.

There we go. Uh, we should be able to, um, launch Codex with the sandbox. Now you probably need to have Bubblewrap, um, already installed. And so I would imagine the instructions would tell us that. Um, I didn't notice that it would prompt us for it. In fact, I might already have Bubblewrap installed. So I'm just going to look for the command to install it. Um, so I'm going to go here: Bubblewrap install WSL. That's what I'm on. It'll be the same thing for anything else. So I'm just looking for the, uh, the install. It might actually already be installed, now that I think about it. I never installed Bubblewrap. I think it's already there. So, I'm going to go ahead and I'm just going

to start the sandbox here. So, we'll say sandbox, and what was it suggesting? Sandbox, uh, read-only, and then we'll say ask for approval, and we'll put here on-request. Okay, so that was its recommended method of turning on the sandbox. And here we are now running in a sandbox. It doesn't indicate that we're in a sandbox, but I want to see if we can actually edit a file, because if we are in read-only mode, um, we should not be able to edit one, or I guess we'll see anyway. So, um, we'll go here and ask, can you change the TypeScript code, uh, to JavaScript? And right now we are in the Wolfenstein project, which by the way works. And I want to see if it will actually

try to write it and get blocked. Okay. And so we are waiting to see what it does. So I'm hoping that it'll just say, hey, I can't do that, because we said read-only mode. But that just means that the sandbox couldn't do it, and then it would go out beyond the boundary and ask. And so maybe, um, with on-request it will ask, or maybe we will have to set it to something very strict like, uh, you know, untrusted. Okay. And so we're just carefully watching to see what it's going to do. "And so I'll edit that now." So, it's saying it's going to edit it.
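For reference, the launch command typed above can be assembled like this. It is a sketch under an assumption: the flag spellings --sandbox and --ask-for-approval are taken from how the command is read out in the video, so confirm them against codex --help on your install before relying on them. The helper name codex_command is made up.

```python
import shlex


def codex_command(sandbox_mode, approval_policy):
    """Build the argv for launching Codex with an explicit sandbox mode and approval policy.

    The flag names are assumed from the narration, not checked against a specific CLI version.
    """
    return ["codex", "--sandbox", sandbox_mode, "--ask-for-approval", approval_policy]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Codex installed.
    print(shlex.join(codex_command("read-only", "on-request")))
    print(shlex.join(codex_command("read-only", "untrusted")))
```

The second line is the stricter variant tried later in this lab, where approval is set to untrusted.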

It's really taking a long time to get to editing. I'm not really sure why. Maybe I should have asked it to do something simpler, like edit the hello world. But, um, yeah, maybe this task was a little bit too large for it. But I'm going to just pause here and wait and see what happens. This is taking a considerable amount of time, so I'm going to stop it. And maybe this task is too large for it. And I'm going to go to our hello world, where we literally have a single file. And I'm going to start back up, um, the sandbox mode. And actually, I'm going to be really aggressive here. I'm actually going to set it to untrusted. And so that means that what it will do is, like, it really

requires approval to do anything. Okay. I'm going to say, can you port the, at sign, hello world.rb to, uh, Python, and I want to see if it will complain and not do it. So there's nothing inherently dangerous that it's doing here. And so here it's edited it. And so I hit escape and ask: why were you able to edit the file? I set sandbox mode to, um, read-only and approval to untrusted. And I mean, this is a very low-risk action. So it might be hard for us to trigger the sandbox. Okay. And so we'll go ahead here and say yes.

Oh, it says, "I wasn't able to edit it." Uh, it's not present in the workspace. So, the file write was called. "So, read-only plus untrusted clearly blocks normal shell writes unless I request escalation. I used apply patch, which is a separate editing tool from the shell commands. This tool can still be invoked by the agent when shell writes are restricted; whether it actually succeeds depends on the harness runtime. In this case, it did not complete, and your interruption prevented any change from landing. I was able to attempt an edit but not actually perform one. So, if you want, I can continue." So, okay: attempt to edit the file, actually edit it. Okay. And so maybe

it's just printing out here. It literally cannot edit. And so we'll go ahead, and it says, uh, "Would you like to make the following edits?" So it's actually asking me, and we'll say yes. So it actually had to go ask for approval, and so now it was able to do it. So it seems like it still had to reach out and ask. So it's not an immediate denial, but it did have to explicitly ask for that change. And so I do believe that sandbox mode is working as expected. Um, we could also try to disable, like, web search. I'm going to make a new file, which will be our, uh, config.toml.

Okay. And this needs to be in, uh, wherever we launch this up. So right now we are over here. So I'm going to go over to this folder here. I'm going to make a new folder called .codex. Well, I'll get rid of this. I don't know why they do that, because then I have to delete it and then make a folder, right? I don't know the logic there. It doesn't make any sense. And then we will drag in the config.toml. And now that we have it, we'll go ahead and we'll try to say, um, web search. Whoops. Web search disabled.

I'm going to do double quotations here. Um, approval policy untrusted. So it still will have to ask us, which is fine. Read-only. This will be, uh, sandbox mode read-only. Okay. And so we'll go ahead and save that. I'm going to stop this. Clear this. I'm going to just type in codex. And so now I'm expecting not to be able to go to the internet. So I'm going to say, can you fetch me the weather from the internet, for, uh, Calgary?

"I can't access the internet, so I can't fetch it. If you want, I can do a local command," or if you want, I can do that. So, uh, you know, try the curl. Let's see what happens. The shell and the weather, to see if this is possible. So now it's asking for approval. And so you see that we can... Oh, here it says failed, because the environment has no working DNS or network access, right? Because we've denied it network access. Um, I wonder if we turned that on, if that would work. So we might go here and say network access equals true.
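Here is roughly what that config.toml ends up holding, written as a small Python helper that renders the text so you can see the exact lines. Treat the key names as assumptions: they are the narrated settings with underscores added, and putting network_access under a sandbox_workspace_write table is my guess at the placement, since the video never shows where that line went. Check the Codex config reference before copying it.

```python
def toml_value(value):
    """Render a Python bool or str as a TOML literal (only the types this sketch needs)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str):
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def render_config(top_level, tables):
    """Render top-level keys first, then each named table, as TOML text."""
    lines = [f"{key} = {toml_value(value)}" for key, value in top_level.items()]
    for name, entries in tables.items():
        lines.append("")
        lines.append(f"[{name}]")
        lines.extend(f"{key} = {toml_value(value)}" for key, value in entries.items())
    return "\n".join(lines) + "\n"


# Key names and the table name are assumptions based on the narration.
CONFIG = render_config(
    {
        "web_search": "disabled",
        "approval_policy": "untrusted",
        "sandbox_mode": "read-only",
    },
    {"sandbox_workspace_write": {"network_access": True}},
)

if __name__ == "__main__":
    print(CONFIG, end="")
```

Notice the string values are in double quotes, which is the "double quotations" step from the lab, while true is a bare TOML boolean.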

Okay, I'm just going to stop this. I'm going to continue on from the conversation that we just had here, and we'll say, uh, I turned on network access. Try again. Okay. And it's going to go out there. Let's see if that helps it. It still might not be able to resolve it. Okay. And we'll say yes. See what happens. I'm terrible at networking. So there we go. And so you can see it.

Notice it was off, uh, off by default, false. So the sandbox probably restricted the network access, as that's what it's supposed to do. And here we're explicitly turning it back on. Um, and so there you go. That is sandbox and approvals in a nutshell. Okay, ciao ciao.

Okay, so we have permissions, which lets you change your permission levels, and we have default, which is the current one, and full access. What's confusing here is that, um, and I didn't mention this in the sandbox video, you always have a sandbox mode set. With Claude, and we're talking about Codex right now, but with Claude, you have to explicitly turn it on, whereas with, um, Codex, you're always in some kind of sandbox mode. Uh, and

then you have approval policies with the sandbox mode that can replicate this. But for some reason, they have this override level with permissions where you can switch from default to full access. What's even more confusing is that in the docs it actually says read-only and auto mode, which clearly is not the language used here, but in the CLI we have --full-auto. So it's very confusing. My logic here is that this is just an easy way to override the mode while you're in, um, the CLI, and it's an option, and it's going to get more confusing when we look at dangerously bypass approvals, which we will look at next. Okay.

Okay, so let's take a look at dangerously bypass approvals. And the idea here is that you can dangerously bypass stuff. Why would you want to do this? Well, if you are running in an automated system and you know exactly the impact of running, uh, without having to get any approval, then that's totally fine. So people that run autonomous agents that are going out to the internet and doing stuff, um, you know, they're running this mode, right? They're not asking for any approval. But, um, it's a bit confusing because it seems like there are two modes. We have ask for approval never, or dangerously bypass approvals and sandbox, which sound like the exact same thing

and probably are the exact same thing. Um, I'm probably not going to make a lab on this to show it, as I just don't have a use case to ever bypass stuff. Um, if we were making autonomous agents then sure, maybe. Um, but we're probably not going to do that here. I just want to show you that it's like this, and it's similar to Claude. Um, and permissions are confusing because we have these permissions, uh, and sandboxes with approvals, right? And how they intermix is not made clear by the docs, and it's not even clear when you're working with it directly. So, you know, just do your best to manage your permissions.

Okay, let's take a look here at Codex rules. So, rules control what commands can run

outside the sandbox. And so, here's an example of a rule. Uh, and so the idea here is you have, let's say, a prefix rule, and so it's going to check the prefix of the command that's running, like, you know, is it gh, then pr, then view; what decision is being made; what the justification is; what it matches; what it does not match. Um, and so yeah, if you know Claude Code, um, this is kind of similar to their, uh, permissions, um, like their allow, deny, and ask permissions. Um, and rules have nothing to do with Claude rules. That's something completely different. Um, but yeah, we'll take a look here at Codex rules. Okay.

All right, folks. What we're going to do in this video is test out rules. I'm going to go back a directory and we'll say make directory rules test. And I'm going to cd into that directory. And I need some kind of very simple web app to test it. So, I'm going to go ahead here and type in codex. Okay. I'm going to say, using Sinatra, create a couple of endpoints that I can test using curl. Okay, I have Ruby installed on my system. If you're on Ubuntu or whatever, you'll need to have that installed, or you'll have to prompt and say, "Hey, can you make me a Dockerfile to run it?" Um, the other thing that you could do is, you know, just use a language that will work for you. But

basically, just ask for a micro framework, um, to have some endpoints. That's all we're doing. So, I'm just using what I'm familiar with. And so it should just take a moment for it to install. Mhm. Okay. I'm going to give this a nice refresh and we'll go over to our rules test here. So now we have a very simple application, right? Um, and so, uh, it asks to allow installing the gem so it can boot and verify with curl. So yeah, sure. Of course. Um, I don't need it to curl right away, but, um, I want it to make a readme, so there are some curls that we can test. Okay, we'll give it a moment here to get set up. All right, and here it wants to

start the application up. I don't need it to start it up, but I'm going to say, can you create a readme file, uh, a markdown file, with some test curls, uh, for our app? Okay. And all I'm doing is making it so I can run them manually. And we're going to end up opening another tab here in just a moment, uh, to make sense of stuff. Oh, did I just close it? Oh, no. I just closed my tab. I didn't mean to do that. Um, and so I'll go back here to Codex.

Well, sorry. We'll go back into rules test, codex. And, um, I just had too many tabs. Okay. There was too much going on. I couldn't see what I was doing. Um, but we'll go back over to here. Go back to this one. Sorry. Uh, please make the markdown file with curls. I just, uh, interrupted by accident. Okay. And so that will continue on here. All right. So, it's come back here with our readme file. Let's go take a look at what we have here. Okay. And so we, um, have these curls here. So what I'm going to do is make a new tab. And I want to

start up the server. Um, so I'll go into rules test, bundle install. And I'm not exactly sure what command they want to run it. Usually it's pretty darn simple. Yeah. Okay. It's just bundle exec ruby. Fair enough. We'll hit enter. And we will give that just a moment. So we're on port 4567. Four, five, six, seven. Good. Um, there's probably, like, a six seven joke in there. I've got really young kids, like really young, and it's beyond not being funny anymore. So, I apologize for that. Okay, we're going over here, and, um, so now we have that, and I want

to just test it with an endpoint and see what results we get back. Okay, so we get this message back, uh, with this. Okay, great. And so that's working really, really well. Um, so what I'm going to do is go ahead now and make a new folder. Well, first we'll get rid of the codex folder, as this is silly. And then we will make a new folder here called .codex. And then I'll make a new rules file, which is, like, block curl dot, um, rules. Okay. And we'll go back over to here.

I think we're over here. And so, uh, can you create a Codex rule, um, in block curl to prevent curl access? Okay. And it's pretty simple. There's like three things we'll fill in. But if we can ask it to do it to get an initial one, then we can check if it can even write this itself. Yeah, it does already exist. That's true. And so here it's going to go ahead and work through that. Okay. And so here, uh, "there aren't any local examples to anchor the syntax, so I'm using a simple rule file directly." Uh, no, this is, no, that's not how you do it. So we'll go over to here. I'll go find, um, the rules, wherever it is here. Codex. Uh, here's the link. Codex rules. Here's the page.

It's crazy that it's just, like, trying to do that. No. Okay. No, there is a format. Uh, Codex rules have a specific format. Surprised it doesn't know. When I talked to ChatGPT, it knew. Um, and again, we could write this out in, like, two seconds. Um, I would just like to see if we can leverage this stuff, because you might have a hard time writing rules, and it might help to alleviate that pain instead of writing it by hand. But again, I'm expecting three fields here, like pattern, match... Yeah, pattern, match, and justification. Right. So, we'll go ahead. We'll say yes, proceed.

I'm not saying that's a good rule. I'm just trying to get it to work here. And so, let's take a look at what we have. So, pattern curl, and it's a prefix rule. I don't know if there's any other kind, because I didn't see any that weren't just prefix. So, maybe they do not have anything else here. I'm pretty thorough when I try to check this stuff, but, uh... Oh, here's the language spec down here. Let's go take a look. And yikes. Okay, let's find the prefix rule. So, yeah, that's not great. It's using something called Starlark. Oh, so it's using something that pre-exists. So, the Starlark language specification.

Oh, gross. Okay. Well, I mean, that's what they've got. So, we'll go back over to here. And so, we have curl, decision forbidden, justification: "HTTP access via curl is blocked. Use another approved tool or test the helper instead." So, here it's trying to help it know the context, like what the justification for it is. I think they explain it here. If we go back, they might explain it a little bit better than what I'm saying here. So, a non-empty, sorry, a human-readable reason for the rule. Codex may surface it in approval prompts, uh, or rejection messages. When you use forbidden, include a recommended alternative in the justification when appropriate. So, there it's talking

about an alternate path. I would think that if you didn't, you'd just say don't do anything else. It's fine. Just deny it. But here it says if it matches this, this, or this, and if it doesn't match that. I think you could also, um... is match optional? It defaults to empty brackets. So, it validates them when it loads. Use these to catch mistakes before the rule takes effect. Okay. So, I think we could just also, um, remove it here. But I guess the not match might be good if it gets confused and it's blocking for the wrong reason. But, uh, again, I would expect that if it just has this, that should be enough, right?
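To make the fields concrete, here is a toy re-implementation of the idea in plain Python. This is not Codex's rule engine and not a real .rules file. It only mimics what was described above: a token prefix pattern, a decision, a justification, and match / not match examples that get checked when the rule is loaded. The helper names, the app.rb file, and the /health endpoint are made up for illustration.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class PrefixRule:
    pattern: tuple          # leading command tokens, e.g. ("curl",)
    decision: str           # e.g. "forbidden"
    justification: str = ""
    match: tuple = ()       # example commands that must hit the rule
    not_match: tuple = ()   # example commands that must not hit the rule

    def applies_to(self, command):
        """True when the command's leading tokens equal the rule's pattern."""
        tokens = shlex.split(command)
        return tuple(tokens[: len(self.pattern)]) == self.pattern


def prefix_rule(pattern, decision, justification="", match=(), not_match=()):
    """Build a rule and validate its examples up front, like a load-time check."""
    rule = PrefixRule(tuple(pattern), decision, justification, tuple(match), tuple(not_match))
    for example in rule.match:
        if not rule.applies_to(example):
            raise ValueError(f"match example does not match the pattern: {example!r}")
    for example in rule.not_match:
        if rule.applies_to(example):
            raise ValueError(f"not_match example matches the pattern: {example!r}")
    return rule


BLOCK_CURL = prefix_rule(
    pattern=["curl"],
    decision="forbidden",
    justification="HTTP access via curl is blocked. Use another approved tool instead.",
    match=["curl http://localhost:4567/"],
    not_match=["bundle exec ruby app.rb"],  # app.rb is a hypothetical file name
)

if __name__ == "__main__":
    print(BLOCK_CURL.applies_to("curl -s http://localhost:4567/health"))
    print(BLOCK_CURL.applies_to("bundle exec ruby app.rb"))
```

The examples are the useful part: if you fat-finger the pattern, a match example fails right away instead of the rule silently never firing.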

Well, we'll go ahead and we'll try this again. So, um, I want to go ahead and just say... okay, so I'll stop this and we'll start it up. Please test, uh, the Sinatra app using the curl commands found in the readme. Okay. And so it should be able to probe it, and I'm expecting that to be blocked. If it doesn't work, it'll probably find a workaround. It's reading the readme. Good. Mhm. Come on, probe. Let's probe. And so now it's saying failed to connect.

The server is definitely running here. Okay. And so, no, no, it's running. So, I'm going to stop this for a second and I'm going to make a new file here. It'll be the config.toml. I just want to make sure it has network access. So, we'll say network access equals true. And I'm going to stop this again. We'll go back here and we'll go to resume. Sorry, I gave you network access. Try again. I mean, it should already have network access, but maybe it doesn't.

No, no, it's already running, on the host at 4567. It already is. We'll see what it does here. Okay. So, "Do you want to allow running curl outside the sandbox so I can reach the host app?" Um, sure. What I'm trying to figure out is, did my rule work? Okay. "The app is running live." I'm skipping... So curl failed to connect, and here it asks again. Look, I have implemented a Codex rule that

stops curl, and I'm trying to test that it's working, so I'm expecting you to tell me curl, uh, is blocked. Um, why aren't you doing that? Okay, so: "It did not actually block me from curl once escalation was available. Inside the sandbox curl failed, but that was a network namespace issue, not a policy denial. Per the active instructions, when important commands fail, I must retry with the required sandbox escalation after that. So if the goal is to test the Codex rule that forbids it, the rule needs to prevent the escalated command from being approved. In this run environment, it's still allowed." So

the correct behavior was not used. "So if you want, I can help you inspect the rule." Yes, please. So I'm not exactly sure what it wants. There's apparently sandbox permissions. That's an option we didn't even come across in the docs whatsoever. And again, that's assuming that Codex is not just lying to us right now, as it's very hard to tell what is what. Um, but maybe it can make a suggestion. Uh-huh. Yeah. And it's reading our config.toml. Cool. So what do we do here, Codex? We'll give it a moment. All right. So we are back. So it says the rule is not taking effect because it's in the wrong place.

"The .codex config.toml contains network access true; that enables access but does not enforce the rules file." Um, so your custom rule... Oh, is it not in the rules folder? Oh, that's why. I didn't notice because I've done this more than once, and so I wasn't paying attention. I thought I had created it. Folks, I make the videos three or four times over. That's my problem. Okay. Uh, so now it's over here. "Your curl wasn't blocked." Um, so that was the one thing we had to change. And after that, we'll test with a plain one. So, um, okay, we'll just go out of here and we will go back here and we'll say resume. Okay, uh, you know, please... okay, so: I fixed the place of the rule,

please try again. I did not even notice I did not put it in the right place. I don't know why it doesn't just remember what it ran, but here it's attempting. So, "Codex allowed the curl to execute. The command failed at runtime. If the rule were active, it would have prevented it." Yeah, I know, right? "So, the result is your rule is still not being enforced. Most likely, the session, uh, needs to be restarted," which I did do. "The rule file name or directory is still not correct." Couldn't it just check that? Uh, "the rule pattern is not matching." If you want... Yeah, please inspect, and please debug and check. I don't know why this is so

darn hard. I mean, that's where they say to put it, in the rules folder. The only thing I can see here is they say default.rules, but they're saying create a rules file in this directory. And that's where it is. Okay. The docs definitely say it's project based. So, it's double checking. "I have enough local evidence." Uh, "the rule is not being read. I'm checking the official stuff." Good luck checking it. "The rule is still not active. Your current file is here. The official rules say to create this and that and then do this. In this session, the only active rules I can see are the loaded global rules path over here." So, created rules in the, etc. Restart it. So the inference

is that it's in the wrong local location. "If you want a project-local rule, you put it here." What are you talking about? We're right in it. Uh-huh. Let me just stop here. Where are we? Well, what? It's where it's supposed to be. Um, all I can think of is to move it here. But this is bizarre, because we should get it to work in projects, right? "If you want a project rule..." Oh. Dot, forward slash, codex. Oh, it's just a folder called codex. Well, how did that... what? Okay, hold on here a second. Hold on.

Hold on one second. Am I to believe that the folder's name is codex and not .codex? How did this work earlier, then? 'Cause this definitely worked when we set all these settings. So, how did it load that if that was the case? I mean, it definitely is saying that. So, this one's .codex and this one's just codex. What the heck? But if we go over to config files, let's go over to here: .codex in project overrides. That is crazy. Okay, let's go over to here. That is so confusing. Oops. Uh, not over here. Um, and we'll go over to here. We'll make a new folder, codex. When they say they're vibe coding these harnesses, I totally believe it. So, now it's over here. We'll type in

clear. I'm going to go ahead and run Codex again. And then we will resume: try curls again, I made a new codex folder. Seriously, why have two folders, one with a dot and one not? "Codex allowed the command to execute; it only failed at runtime. If the rules were active, it would have stopped before." I'm not crazy. Okay, I'm showing you I'm not crazy. If we go back over to here, the codex folder, or over here. So, the next thing I'm going to do is I'm going to rename this to defaults.

Okay, we'll just try a completely new session. And I'm going to be very specific. I'm going to take this command here. Just this one here. Run this command. That's it. Okay. So, anyway, I'm not exactly sure how we're supposed to get this rule to work. Um, but the point is, you can see that it is very, very difficult. Um, I don't know what to tell you, it doesn't really work as expected. Uh, I'm not crazy, and it can't figure it out itself. Um, but we're also kind of fighting with the fact that here it doesn't have network access, and we did open it up, so that should be less of an issue. I suppose

the only thing we could do is we could try to, like, get rid of the rule, even though it shouldn't even be loading anyway. So, I'm just going to go ahead here. I want to just ignore this for now. Say ignore codex. I'm just going to go ahead here and say, um, yeah, I have on my host machine... Let's just make sure this is working, right? Yeah. So: I have on my host machine, uh, a server running. This endpoint works. Uh, what do you need to be able to do to hit this endpoint? Let's just rule that out first. Okay. So here it says localhost inside my sandbox session is not your host

machine. So, uh, that does not mean I can reach it. Um, "expose the server on an address reachable from this environment, give it a reachable host address," or the typical way is to bind on 0.0.0.0. We'll go over to here. We'll bind on 0.0.0.0. That's a good idea. B 0.0.0.0. Fair enough. I'm terrible at networking, folks. I'm not a networking engineer. Okay. The networking folks here are probably like, Andrew, you're dumb. And I'm like, yes, I am dumb. I don't know how to network. And it's already binding to 0.0.0.0. It's already set for that.

Okay. So, we'll go ahead and run this again. Come on. There we go. It's already on 0.0.0.0. Well, what are those curls going to? What the heck? Um, "it runs on 0.0.0.0. The server is listening on all interfaces. So for me, I still need an actual reachable address, uh, from this workspace." The blocker... I do not know exactly what you want me to... I don't know. I don't know sandboxes well enough. Like, how do you network, uh, the Codex sandbox to this? Me: no network knowledge. Okay. Like, I don't know what you want from

me. I don't want to read a thousand paragraphs here. "You don't need a network on a Codex, uh, sandbox. Keep your server listening here. Find the host machine IP address and use that instead. Why this doesn't work from this session: that points to the local environment, not your host." Yeah, of course. Sure. Fine. We'll do that. That seems really silly though. Well, I mean, can't you run ipconfig and find out? Like, why am I doing this? And can't you work outside the boundary to find out?

Uh-huh. Okay. So, yeah, but you are in a sandbox. Can't you just go outside of the sandbox and find out? You could have just asked me for approval before. Here we go. No, stop. You are running in WSL and this server is in WSL. "If the server is running in WSL, then the address is just this." Okay. "The real issue is not Windows versus WSL, it's whether the Codex session shares the same network namespace. So if Codex is using WSL, it will work. If Codex is in an isolated sandbox, I can do that. So what I need to do is, it's not a different IP. I need

permission to test, uh, from here, on the outside, because this will tell me immediately whether the session's there." Okay. Well, then ask permission, please. Look how confused it got, because it doesn't know anything about itself, right? So, it's requesting, and I'll say yes. There we go. Now it works great. So, the answer is, "I need permission to run the network in your setup."
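If you want to rule out networking yourself before blaming the rule or the sandbox, a plain TCP connect test tells you whether a host and port are reachable from wherever the script runs. This is a minimal sketch with Python's socket module. The demo spins up its own throwaway listener on a random local port, so it makes no assumptions about your Sinatra server.

```python
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def demo():
    """Open a temporary listener on localhost and check that we can reach it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        return can_connect("127.0.0.1", port)
    finally:
        listener.close()


if __name__ == "__main__":
    print(demo())
    # For the lab's server you would try something like can_connect("127.0.0.1", 4567).
```

Run it once from your normal shell and once from inside the agent's sandbox. If the two disagree, you're looking at a network namespace or sandbox restriction, not at a broken server.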

Okay. So, now the next thing is I'm going to rename this. Put this like this. Okay, we'll stop this. We'll try this again. Okay: please run this curl command. You have to, uh, ask me for my permission so you can run outside your sandbox. This, uh, is a Sinatra server running on WSL and you are in WSL. So you are on the same network. Okay. So we'll go ahead and just help it know a little bit more. Mhm. And so we'll say yes. And it succeeded. Great. I have a rule in Codex, um, the defaults rules, and you did not, uh, you were not prevented by it.

Well, rules. "The rule exists but doesn't apply to the path." Oh my goodness. This is not smart. Okay, but like... No, it's right here, right? It's called rules. And people complain about Claude being dumb. This is so stupid.

Like, see how it got hung up on that one "s" there. "The rule did not stop me because the command was executed through the tool-level escalation flow, and I explicitly requested approval in the active session." Okay, so the point is that rules are a pain in the butt. I'm not exactly sure how to best invoke them. They suck. Sorry, folks. But it seems like maybe it's talking about what it can do within a sandbox. And you would think that if you give it permission to curl and it runs into something forbidden, then the rule would take effect. But I just can't get this to work no matter what. So at least we know how the rules are defined. If you figure it out, you tell me. I'd love to know, and everyone would love the help. But there we go. Okay.

Okay. So Codex has a configuration file that comes in the form of TOML. It's literally config.toml, and it can be in more than one location. There is either the global scope location, where it sits in your home directory under .codex, or there's the option where it's within your project, and it's just a config.toml.

So those are the two locations where it can exist. We will double-check, triple-check the project configuration file, as I'm finding that Codex's conventions for naming things and placing them are not consistent. Let's take a look at some of the things that we can set. There are a lot of options, so we won't go through all of them, but here you can see we can set which model should be used, specify the model catalog based on its JSON, set the sandbox mode, set whether web search is cached, turned off, or live, set its personality, set the reasoning effort, and set sandbox information. Most of the time you're going to be going in here to meddle with your sandbox information. Again, there are a lot of options, but we'll cover what is relevant, because there's a lot that isn't. Okay.

Hey, it's Andrew, and in this video we're going to take a look at adding a configuration file. We've probably done this before, but let's make sure that we absolutely know how to do it. So what I'm going to do is go into our hello world, where we actually already have a TOML file.

What's interesting here is that I have it in a .codex folder, called config. But I think when we go back to our docs, it was saying it could sit there in the directory directly. There are just so many places that it can load from. So let me go double-check here. It says .codex config.toml, and then one there at that level. Okay. So what we're going to do, even though we already have one, is make a new folder here. And this will just be called simple math. Something simple. And so I'm going to cd into simple math.

Oh boy, that is a folder. Come on, let me in. Simple math. Oh, it's an underscore. That's why. And now that we're there, we'll make a new folder here called .codex, and then we'll make a new config.toml file. What I want to do here is just change which model is being utilized. I can't remember what models are available to us, so I'm just going to type in codex and we will see what models we can use. Right now we're on GPT 5.4. I'm going to type in /model, and we have 5.4 mini. So I'm going to go up here and we'll say gpt-5.4-mini. I'm going to match it based on what it says there. I'm going to save that file. Close this out. Clear. Type in codex. And all I want to see is, does it load that model in? And there it goes. It loaded it in. It's as simple as that. That's all I really wanted: to make sure that you know how to set it and where to set it. But yeah, there you go.

Okay, let's talk about non-interactive mode. Codex has a non-interactive mode, also known as headless mode. That's what Claude calls it, so I'm just trying to normalize that information across providers here. It allows you to execute requests without entering the interactive terminal. Here is a bunch of examples. Let's walk through them. I'll get my pen tool out so I can point to the ones that we're on. The first is just codex exec: you provide it a prompt and it will execute. We have the ability to pipe things, so we can pipe into and out of Codex. We can set it to ephemeral, so it will instantly forget; it's not going to create a session or continue on. We can set the permission level, so here we have full auto mode. We can set the sandbox, which we cover in the sandbox section. We can ask it to output JSON explicitly, and then you can pass it to jq and do something interesting. We could pass our Codex API key, so we are not utilizing our subscription, or if we just have different keys and we want to leverage multiple keys at the same time. We can resume a previous conversation. A very common use case for non-interactive mode would be GitHub Actions. Here you can see we're installing Codex and then running it.

And it's running here with very specific settings. Technically, with GitHub Actions, there is a very specific GitHub Action for Codex. So when we do the GitHub Actions section, if I make a video on that, we'll probably use the action instead. But if you need full control within any environment, even if you're not using GitHub Actions, you're going to be using this mode. Okay.

Okay, let's take a look at the non-interactive mode. I don't think it really matters what repo we're in, but I'm going to go into our hello world, as that is a very simple application and a great place to check. I'm just going to sync whatever I have here, and we'll go ahead and try it. We already have Codex installed, so I'm going to try codex exec, and I'll increase the font here so you can see what is going on. Sometimes it helps to do this, so I can move this over, and we'll just say, "Tell me what is in this repo or this project folder." Okay, let's see what we get back and if it works. Okay, I'm not sure what the issue is, but it's possible that I haven't logged in for a while. So, whoops. Oops.

Oops. It actually was working. We'll try that one more time. It was strange because it's showing an error. So, "Tell me what's in this folder." Error: shell snapshot validation failed, snapshot command exited. I don't know why it's complaining, but it's working. It's doing something. That initial one was a little bit confusing. So, "At a glance, this is a minimal project." So it definitely works. I don't know why we get that initial error, which is very, very confusing.

Let's go up here, and I'm going to add --json, as they say we can do this. For the prompt: tell me about the project, provide a JSON structured report back. And for fun, we will also pipe this over into jq with a period. What's interesting is that it's going to stream the information, so we're getting back JSON and we're getting it formatted. Here we can see the information coming back. It seems like it streams everything back, not just the single command. That first one's kind of strange. It's some kind of error that doesn't seem to matter: shell snapshot validation failed. So that doesn't matter. But here we're getting back all the output, and it's showing us every single step. This is really interesting. I thought it would just come back with the final result, but literally every single step is streamed out. That is really, really cool, because it's going to give us a lot of flexibility in terms of how to work with this. It also tells us exactly how it works. You can see turn.completed, other item-completed events, stuff like that.

It might not seem obvious here, because we don't really cover it in this course, but I have another course called Claude Architect Foundations, and when you know the internals, if you have to build more complex systems, it helps you understand how to work with it. We don't cover that here because we are just learning how to utilize this for our own usage, not necessarily to build complex things. Is there anything else that we might want to test? I mean, those are the two most common use cases. We might want to resume a conversation from prior. So I'm going to go here and just type in codex.
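Going back to the JSON stream from a moment ago, here is a minimal sketch of how a script might consume it line by line instead of eyeballing it in jq. It is Python using only the standard library. The only event name taken from the demo is turn.completed; the item.completed name, the item fields, and the sample text are hypothetical placeholders for illustration, so check them against what your own stream actually prints.

```python
import json

# Sample lines shaped like the stream in the demo. Apart from
# "turn.completed", the event and field names are hypothetical placeholders.
SAMPLE_STREAM = """\
{"type": "item.completed", "item": {"type": "agent_message", "text": "Scanning the folder."}}
shell snapshot validation failed
{"type": "item.completed", "item": {"type": "agent_message", "text": "This is a minimal project."}}
{"type": "turn.completed"}
"""


def parse_events(stream_text):
    """Return the JSON objects found in the stream, one per line."""
    events = []
    for line in stream_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON noise such as warning lines
        if isinstance(event, dict):
            events.append(event)
    return events


def last_message(events):
    """Return the text of the last completed item that carries text, if any."""
    text = None
    for event in events:
        item = event.get("item")
        if event.get("type") == "item.completed" and isinstance(item, dict):
            text = item.get("text", text)
    return text


if __name__ == "__main__":
    parsed = parse_events(SAMPLE_STREAM)
    print(f"{len(parsed)} events, final message: {last_message(parsed)}")
```

Skipping lines that fail to parse is a deliberate choice here, so that a stray warning mixed into the output does not break the whole pipeline.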

We'll type claude, of course. And we'll type in resume. I just want to see what else there was here. So here we have, "Can you fetch the weather from Calgary?" I don't know if we have to type this exactly, so I'm going to copy just that part, and I'm going to quit out of this. We'll say codex exec resume --last. I think last means resume the last session. Oh, we restricted this machine quite thoroughly, so that's why it's not going to be able to pull it under our hello world. At some other point we were really aggressive with this config TOML, and that's why we are not able to do that. So it still works. I think last just means continue the last conversation, because over here it looks like, okay, type in the name, and that's how you know which one to resume, but really it's just resuming the last one. Okay.

Could you do other stuff with that? I'm not sure. Let's take a look at the codex exec help. There's more information there, because the docs aren't always up to date, right? So it might just be here. Here it says provide a previous session ID or pick from the most recent one. So here we can provide the session ID, I suppose. If we were to go here and not put last, let's try this and take this out of here. Do we get to choose? No, we don't. So that's not clear, right? It's saying we get to choose, but obviously we don't get to choose. We've got some color options here. I'm just seeing if there's anything else we missed. You can tell it to run in a very specific directory.

I mean, that could be really useful, because maybe you're not running it where you actually are, so you might want to do that. Change the model. Yeah. So there are obviously more options here, and of course the help page helps us out. The examples I showed were from the docs, so you can see the docs don't always match. Well, technically this is documentation, just in another form. So always make sure to check everything, but there you go. Okay.

So with OpenAI, we can interact with it on different layers, or levels if you will. This is important to know because some things will only be available at certain levels. The lowest level is going to be the OpenAI REST API. What's interesting here is that OpenAI does not have a generic library. For example, if you're using Anthropic Claude, they literally have a library called the Anthropic SDK, which is specifically just a wrapper for their API. There are probably some third-party ones out there, but it's not hard to work with it directly. So here's an example of using JavaScript to interact with it. We'll just make sure that we understand the layers: there are more abstracted layers with OpenAI versus Claude, and that was just the choice that OpenAI went with for whatever reason. Okay.
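To make "working with it directly" concrete, here is a minimal sketch of a raw call to the REST API using nothing but Python's standard library. It targets the Responses endpoint with a model and an input, which is the shape the later demo uses. The model name is an assumption carried over from the config demo, and the response parsing reflects my understanding of the Responses payload (a list of output items containing output_text parts), so treat it as illustrative rather than authoritative.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/responses"
MODEL = "gpt-5.4-mini"  # assumption: the mini model named earlier in the course


def build_request(prompt, api_key, model=MODEL):
    """Build (but do not send) a POST request for the Responses endpoint."""
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract_text(payload):
    """Join the output_text parts of every message item in the response."""
    parts = []
    for item in payload.get("output", []):
        if item.get("type") != "message":
            continue
        for chunk in item.get("content", []):
            if chunk.get("type") == "output_text":
                parts.append(chunk.get("text", ""))
    return "".join(parts)


if __name__ == "__main__":
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        print("Set OPENAI_API_KEY to send the request.")
    else:
        request = build_request("Write one sentence about REST APIs.", key)
        with urllib.request.urlopen(request) as response:
            print(extract_text(json.load(response)))
```

Keeping the request builder separate from the network call makes it easy to inspect exactly what would be sent before spending any tokens.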

Hey folks, it's Andrew, and in this video we are going to learn how to interact with the OpenAI REST API. I'm going to make a new folder here. We'll call it OpenAI REST. We will cd back a directory, go into OpenAI REST, and type in codex. We just want to do a simple action: I want to demonstrate using the OpenAI REST API directly. So can we do that using, let's say, Ruby? I really like Ruby, so we'll go ahead and do that. "Just create a simple Ruby script using HTTParty," I guess. All HTTParty is, is a way to send out a request. Every language has one. I don't know what the other ones are called, but it's basically a way of invoking it. We can also make a curl command to test it. We'll take a look at what appears. As far as I'm aware, there's probably an unofficial library, but there's no official library per se. So we will see what we have here. I'm hoping I'm in the right directory. When I was generating this, I wasn't paying attention. So where did it go? Oh, no, it put it in the right directory. So here it is. It's going to bring in that OpenAI API key. It's going to call responses, it's going to use mini, and it will take an input: "Write one sentence about why direct REST APIs are useful." So that is fine.

I'm going to need an API key here, so what I'm going to do is stop this. I know how to run this. It's pretty darn easy. We will do a bundle install. Okay, I'll just write down the instructions here. We've got bundle install, and then once that's installed, we can run this. It'll just be bundle exec ruby openai_rest.rb. That's what we will need to execute it. So I just need to go get a key. I'm going to go find one. We'll just type in OpenAI developers. I'm doing this offscreen so you cannot see what I'm doing just yet. I'm over here, and I'm going to get a new key. This one is from a day before, so we'll revoke that key, and I will create myself a new key. This will be test REST API. I'm going to go ahead and create that key. We will copy that key. And now I need to set it. So this is going to be export OPENAI_API_KEY equals, and then we'll paste it in and hit enter. Now let's see if we can run this and invoke it. The main thing here is that there are more models than what Codex is using, right? There are image models, there are all sorts of models, and if we want to have access to them, we'd basically have to use this API.

"...are useful because they engage..." Oh, so here is the response to the prompt. It came right back. So I can go back over here and we will change this to say, "Give me a bulleted list of studying tips to pass the JLPT N5." I just took the N4 in December and I failed it, so I need all the tips I can get to pass. Okay, so we're going to wait for it to generate, and it comes back. Here is a bulleted list: learn hiragana, build a basic vocabulary, etc., etc. So it works, and that's all I really wanted to show you.

Maybe we can convert this into a curl. So I'll just say, "Hey, can you create a curl command from my OpenAI REST code?" Okay. We'll wait for it to generate that out. It should be able to do it. We're still using the intelligent model. I mean, mini is fine as well, but we are using the one that's appropriate. And so here we have the equivalent. "If you want it to behave more like the Ruby script..." I don't care about that, but we'll go ahead and grab this. I'll make a new terminal here. We'll go down here and I'll just paste it in. In theory, this should just work. So go ahead and hit enter.

And nothing's happening. Oh, there we go. It's back. Okay. So there is our response, because it's not streaming, right? It was probably working that entire time. And there you go. We just learned how to work with the REST API.

Hey, this is Andrew. In the OpenAI REST video, we ended up implementing against the REST API directly. I forgot that they actually do have an SDK called OpenAI. I guess I got confused because there's OpenAI Agents and Codex, and it's been a while since I directly used it. I thought the Agents SDK was what I was using before, but there actually is one that just wraps the REST API, so we should go ahead and use that one. We already did OpenAI REST, so we'll say mkdir OpenAI SDK here and go into it. Yeah, you can't remember everything, folks. There's just too much going on. So I'll go in here and say, "Using the official OpenAI SDK, I would like to demonstrate calling a model like image generation." Okay. Now that I think about it, this is the primary way that I was working with it before. But obviously, if you use Agents, it's going to be way easier to build out agent interactions like tools and things like that, whereas with this one, you'll have to do a lot of the work passing things along. We will go ahead and get this generated. I made a new folder over here. Hopefully I cd'd into the correct directory. I kind of forget if I did or not, but we will give it a moment to generate.

Okay, we are back. Let's take a look at what we have generated. It's fine what it is here. It's probably using the TypeScript library. It did add a node_modules entry, so that was really, really nice. I want to go into our examples. MJS. What the heck is an MJS file? I've never heard of an MJS file. "It is a JavaScript source file designed for the ECMAScript standard." Why did it do this, of all things? OpenAI, sure.

I mean, I guess as long as it works, but it looks just like JavaScript. Anyway, here we would be generating an image. Does it specify what it used? gpt-image-1. Often the way this works is that you import OpenAI, and then it's going to be client.images.generate, right? But it's very, very low level. So what we'll do is back out here. I believe I already have a key. I'm just pulling the same key, folks. I show you how in other videos, so I'm not going to show it again. It's probably already set here, but I'm going to set it again out of habit. And so let's go ahead and do an npm install first.

npm run start, or npm start. Let's see if that works. That's not there. Let's go over here; they call it generate image. For some reason they got really creative here. So we'll say npm run generate image and, whoops, we will generate that image. Hopefully it's going to say where it places it. If we go back over to the code, it's going to export base64, and we have an output file here. It's generated image, so it should drop it right in this directory. We'll give it a little bit of time to generate. It says that it's done.

We'll give this a refresh. Here we have an output. There's our image. Beautiful. What is the size of this image? I just don't want to commit a very large file to my repo, so I'm curious how large it is. Properties. It is 1.62 megabytes. Beautiful photo, but too large for my taste, so we'll go ahead and delete it. But I like it. And so there you go. That is the OpenAI SDK. All right. Ciao ciao.
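The step where the generated script exports base64 and writes an output file, plus the manual size check at the end, can be sketched in a few lines. This is a minimal Python sketch, not the generated .mjs code: the function names and the one-megabyte threshold are hypothetical, and the base64 string stands in for whatever the image API actually returns.

```python
import base64
import tempfile
from pathlib import Path

MAX_COMMIT_BYTES = 1_000_000  # hypothetical threshold, pick your own


def save_b64_image(b64_data: str, out_path: Path) -> int:
    """Decode base64 image data, write it to disk, and return its size in bytes."""
    raw = base64.b64decode(b64_data)
    out_path.write_bytes(raw)
    return len(raw)


def too_large_to_commit(size_bytes: int, limit: int = MAX_COMMIT_BYTES) -> bool:
    """Return True when the file is bigger than we want in the repo."""
    return size_bytes > limit


if __name__ == "__main__":
    # Stand-in payload; a real run would use the base64 string from the API.
    fake_payload = base64.b64encode(b"not really a PNG").decode("ascii")
    with tempfile.TemporaryDirectory() as tmp:
        size = save_b64_image(fake_payload, Path(tmp) / "generated-image.png")
        print(f"wrote {size} bytes, too large: {too_large_to_commit(size)}")
```

Checking the size in code saves opening the file properties by hand and makes it easy to skip committing large outputs.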

Let's take a look here at the OpenAI Agents SDK, which allows you to build agents using Python or TypeScript. Here's an example of the TypeScript one. We are defining what an agent is, and then we are calling it with a name and instructions. We wait for the results. Very straightforward. There are many features within the Agents SDK: the agent loop, guardrails, tool calls, MCP tool calls, sessions, human-in-the-loop mechanisms, tracing, and real-time agents. The thing that is confusing is that there's also the Codex SDK, which serves a different purpose. For whatever reason, OpenAI has separated this out, whereas when you use Claude, Claude has its own Claude Agent SDK, and it's like a combination of the Codex SDK and the OpenAI Agents SDK. So I need to make these separate layers very clear, and also the fact that OpenAI does not have an official REST API wrapper. Okay, so we're just distinguishing between those, and it gets confusing because they have another layer, which is the app server, but we'll look at that in just a moment.

Hey folks, it's Andrew, and in this video we are going to create a very simple agent using the OpenAI Agents SDK. I'm having this problem where sometimes my bash terminal just becomes unresponsive. There are a few tricks for that, like opening a new one here. So we will cd into simple agent. I don't know what hangs it up. I couldn't tell you why. But we'll go ahead and type in codex here, and we'll just say, "Using TypeScript, can you create me a simple agent using the OpenAI Agents SDK?" We'll let it go ahead and do that. I believe it's also available in Python, but saying OpenAI Agents SDK should be good enough. And so we will just wait a moment here. Okay. All right. It seems to be back. We'll go over into our simple agent, and we've got a lot of code here. The only thing that's kind of frustrating is that it doesn't generate a .gitignore.

So we'll go ahead and do that. I will just add node_modules, as node_modules ends up being quite the headache for us. I'm going to stop that here, and we will run npm install. Once you learn a few languages, it becomes very easy. We're going to scroll on down here and take a look. So we have a simple TypeScript agent. We're importing the agents here. We're supposed to have the OpenAI API key, which is fine. Here it's asking about the current time and date. And here we can specify the OpenAI model. I kind of wish it defaulted to a specific model.

Okay, I'll go back here. "Can you make the agent default to GPT mini?" I probably should tell it what version, otherwise it's going to get it wrong, but we'll just let it do that in a moment. What I'm going to do is go ahead and get my API key. I was using this for another project. I've shown you how to generate API keys quite a few times here, so I'm not going to show you again in this video. Go back to the API key generation part, which will show you how to do it. And here it's talking about 5 mini. That doesn't sound right.

Well, I guess we'll find out. Maybe it will work. We'll see. Okay. So now I'm going to export OPENAI_API_KEY equals this, and then we will run it. We'll say npm run watch, or sorry, npm start. That's usually the standard thing. These are all over here in this, so if you go here, npm start. It's always npm start. I'm expecting it to output the time and date here. It says, "I'm an assistant. I give concise answers. Tools I can use: current time, parallel. What would you like to do?" And so there we go. It's responding back.

Which is good. Let's go take a look at the source code here. You can see we have instructions, we have our model, and we have the tool, a custom tool that we defined, which is really cool. The only thing is that there's no continuous loop here. If we go down below, it says introduce yourself in two short sentences, and it's supposed to wait. So I'm going to go over here and say, "The agent is immediately exiting, and so we aren't getting a feedback loop from the user input."

Okay, so we'll let it fix that. Let's go back and take a look here. It says, "I've updated the README to document the new behavior." I don't need that. I need it to actually change the functionality. And now it's accepting readline input. I can see here in the code that it's going to do that and then run a turn. Pretty clear. So we'll go back over here and start this. We will give it a moment. Mhm. It's just doing that initial prompt here. So, "Can you tell me the time in Toronto?"

Okay. So it's going to go ahead and try to use that tool. Not the best code, but it's clear, right? The other thing is that we have this get current time function, but how would it know what to call? Normally there'd be a function defined somewhere here. That's kind of what I'm looking for, unless there's already a function called get current time. Oh, over here it says now date. Okay. So it doesn't matter where you are.
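The limitation being uncovered here, a time tool that only ever reads the host clock, is easy to see in a small sketch. The version below is a hypothetical replacement written in plain Python rather than the generated TypeScript, and it is not tied to any agent SDK: it takes an IANA time zone name (America/Toronto, or America/Edmonton for Calgary) and converts the current moment into that zone. The function name and the optional now parameter are my own for illustration.

```python
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def get_current_time(tz_name: str, now: Optional[datetime] = None) -> str:
    """Return the current time in the given IANA time zone as text.

    `now` is only there so the function can be tested with a fixed moment;
    it must be a timezone-aware datetime when provided.
    """
    try:
        zone = ZoneInfo(tz_name)
    except (ZoneInfoNotFoundError, ValueError):
        return f"Unknown time zone: {tz_name}"
    moment = now if now is not None else datetime.now(timezone.utc)
    return moment.astimezone(zone).strftime("%Y-%m-%d %H:%M %Z")


if __name__ == "__main__":
    for name in ("America/Toronto", "America/Edmonton", "Mars/Olympus"):
        print(name, "->", get_current_time(name))
```

A tool shaped like this gives the model something meaningful to pass as an argument, so asking for Toronto and asking for Calgary can produce different answers.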

It's always going to do this. See how it says your host time is this? So I say, "Give me the time in Calgary." It doesn't know where we are. This is just going to get the time on our local running machine. So I can't say Toronto or Calgary. That's not going to make any sense. But anyway, it's pretty straightforward, and you can see you get a lot of batteries out of the box here. I could literally make this into a two-hour video, but I figured we'll just stop it here. And there you go.

Okay, so we now have the Codex SDK, and this allows you to run Codex programmatically using TypeScript. I don't know if there's a Python library.

I think there's just a TypeScript library. I don't understand why, in the age of being able to generate anything, we have limited SDKs, but this is what they're doing. Over here we can see the TypeScript SDK. We are defining Codex, we are setting up the thread, and we're asking it to produce a result. So what can the Codex SDK do? It provides file system access, code editing, command execution, test running, and repo awareness. The Codex SDK helps you build coding harnesses, where Codex can be used alongside the OpenAI SDK, because they serve different purposes, right? I know it's confusing, but we will keep working through this. Okay.

Hey, this is Andrew. In this video, we're going to take a look at implementing the Codex SDK. We'll go into a new folder and just see what we can generate. Of course, you can do this yourself, but the reason I do this along with you is that in case something doesn't work, we can fix it together. I keep typing that. I figured I'd made it already, but I didn't. So we'll cd into that and open up Codex in this directory. Okay. We'll just say, "Can you give us an example of the Codex SDK using the official library in this folder?" We'll let it generate and wait a moment.

Okay. We are back, and it says we have a minimal example running. Let's go take a look at the code for Codex. We give this a nice refresh, and there's my SDK folder, with examples. Now it's following this MJS pattern. Sure, it's fine. Again, I'm just used to it being JS or TS, but that's totally fine. So here we have OpenAI. That does not look correct to me. So I'm going to go over and look up the Codex SDK.

Okay. And we will take a look here. So this is the Codex SDK. It did not understand what we were asking for. We'll go down here. It's using the GPT Codex model. All right. So, "Look, you didn't use the Codex SDK. That's what we wanted." We will let it go correct its mistake. Uh-huh. And we'll come back and see what we get.

Okay. So we will take a look at the code. Actually, before we do that, I'm just going to set the API key here. I'm going to stop it and we will set it. So we'll say export, not Codex, OPENAI_API_KEY equals this, and hit enter. Let's go take a look at what code we have now. So now we have the Codex SDK. "You're helping with a JavaScript repository. Review the codebase structure. Return a short title with three bullet points." So here we have Codex. It starts a thread. It runs the task. Very similar to the OpenAI Agents SDK. Again, the documentation says one has less than the other, but does it, though? I just don't trust the docs from OpenAI, because I've been finding things that don't match exactly. But anyway, that's fine. We will just continue on here. I really think it'll take more time working with these tools to really know. We'll go ahead and run this and see if it works.

Again, we're just doing sanity checks, making sure that this stuff works as expected, right? And here, right off the bat, we're getting information back. So it works. So there you go.

Hey, it's Andrew. In this video, I want to take a look at the Codex application. We're going to go over here and type in Codex. Cursor wants us to pay for it. Sorry, we're going for Codex here, and we're looking for the Codex app. So I'm going to go ahead and download it. It should be available on all platforms. I actually already have it installed, but I'm just doing this again. As you can see, it's pulling from the Microsoft Store.

If you are on a Mac, the process will be a little bit different. Is it on Linux? I'm not sure. We'll go ahead and open up Codex. Oh, I guess it's the Codex app. Yep. And so over here we have it. You'll notice that we can load in repositories or folders. So what I'm going to do is add a new project. I'm doing this off screen, but I'm going to navigate to my sites directory, to the Codex examples, and open it up here. This is not how I would want to work on an application, but it's definitely a way that you can work. This is obviously for folks that are less technical, but I think that to get the best leverage out of Codex, you should actually be working in Visual Studio Code and using the SDKs, which we didn't go super deep into.

Right off the bat, you see we have some options. So "build a classic snake game" populates it here. We have our options over here. We can change our effort. We can also speak to it, which might be fun. But we're going to go ahead and hit send. The only issue with this speak button is that it might not be able to work while I'm also recording, so I might not be able to demo it. You can see we have thread conversations here. We can go over to plugins, and there are tons of plugins that we can bring in. So here's an interesting one.

Game Studio. Designing a browser game that plays in a core loop. That sounds really cool. Let's add that to Codeex and we'll install Game Studio. These are just plugins. Nothing super complicated, but they're just basically adding in skills. And actually, right now it's building a snake. I'm going to stop this uh for a moment here. And so now that we have these plugins, you know, we have all these skills. That might be something that I might be interested in trying out. I don't know if I can make this larger. Oh, I can. Excellent. So, I just did a plus there. And we'll go back over to here. I'm going to go ahead and look

for my skills. Can I see my skills? All my skills. How do I see all my skills in here? That's where I say that I kind of prefer the um uh the other way we're working here because we can easily see what we have. But here we have game play test game studio game front end uh 3D asset stuff. Sure. And we'll go back over to here just going to start a new conversation. I try this again because now it might pick up those skills, right? Normally, you'd have to do dollar sign and say like 3D, whatever it is, uh game creator, whatever it was. It's not showing up as well, though. And so, what

I'm hoping is that it would pick up those skills. It's not picking up those Let's go over and take a look at automation. So, this would be for probably setting background tasks and things like that. So, automate workflows by scheduled threads. Yeah. So, here we have a bunch of different ones that we could implement. So, here we have yesterday's activities and we could set when it is. I don't personally want to set this up, so I'm not going to do that. Um, but anyway, you can see what it is. There's nothing super exciting about um this. Again, I think it's the very primitive way to work with codecs, and I would not recommend this method. I would only do this if like you didn't

know how to get Visual Studio Code installed, but if you can't do that, you probably aren't taking this course. It'd be a different course for those people. I'm going to just stop this here, but we have an idea of the Codex app. Up here you can get the terminal, right? Nothing exciting. Okay, ciao.

Hey, it's Andrew. In this video, we are going to go ahead and install the Codex extension for Visual Studio Code. So, just go ahead and type in Codex here. I actually already recorded this, but I lost the video. So, here I'm just going to uninstall it really quickly, but you're going to go ahead and install it. And once you install it, in the top right corner, we'll have Codex here. You can

see I already have a conversation. I'll make a new conversation, and I'm going to just go into hello world.rb here, take a screenshot, drop it in here, and say, "Where is this code in this repo?" It's pretty darn straightforward. We could go up to our settings if we want to install plugins. We could search for plugins here, or skills. So, here we have Sora. I think they're not doing Sora anymore, so that one might be gone. But maybe you want a skill to create and edit spreadsheets, so you can add that skill there. The only thing is I don't think these commands work exactly the same way. So I go ahead and type in skills.

It's not bringing up that command. Say agents, you know. So it's a little bit limited in terms of the way it can work here. I probably wouldn't recommend using this because the CLI is so powerful. The only thing that's kind of a pain is trying to figure out how to paste images into WSL 2. Not a problem on a Mac, but definitely a problem here on Windows. But anyway, that's the Codex extension. Nothing to write home about, but there you go. Okay.

All right. Let's talk about the Codex app server. This is the interface used by Codex to power its rich clients. And so, if you want deep integration, this is what you're going to use. It implements bidirectional communication using JSON-RPC, which they call the app server protocol. And so if you're using the Codex SDK, there's a lot of stuff that you would have to roll on your own. But with the Codex app server, you're going to get a lot of additional functionality. This is confusing if you come from Claude Code, as they have the Agent SDK, which has all this functionality, but for whatever reason, here it's separated out into the Codex SDK, the Codex app

server, and then you have the Agents SDK. So they're all doing something a little bit different, and they can all work together. But anyway, here's an implementation of the Codex app server. The idea is that you would just start up your server. So, you're saying, I'm starting up a Codex app server. All right? Or you can say listen with WebSockets. And then the idea here is that you can just send it information. So, you're just saying, I'm going to send the initialization method with these parameters, as it's using its own protocol to communicate. And so, here you don't need to know complex SDKs. You're literally just sending messages back and forth. And

so that is the approach that they took.

Okay, we can use GitHub Actions along with Codex. We saw earlier with the non-interactive mode that you could just put it into any kind of build server and utilize Codex, but they have their own built-in Codex GitHub Action. And so you can see here, I'm just going to get my pen tool out, that we can call it directly, provide it a prompt file and an output file, and set its sandbox. So it's definitely an easy way to work with Codex in GitHub Actions, and you could have it react to stuff and do stuff based on whether you have pull requests or comments or things like that. So yeah

there you go.

Hey folks, this is Andrew. In this video, what I want to do is implement a use case with Codex and GitHub. So, I'm going to go over and type in Codex GitHub Actions, as there's going to be some kind of GitHub Action that we can utilize. And so, if we go over to here, this should be the open source one. I'm making sure that this is indeed the official one. It is, over here. And so, I just need a simple use case. I'm going to copy this over to here. And we will cd back a directory, and we'll just say codex, and we'll say, I need a simple example to show off GitHub Actions with Codex, reference here.

Okay, so, please create me a simple workflow, and then we will verify that workflow in just a moment. Okay. So let's take a look at what we have here. Hold on here one second. Sorry. Yep. So we'll go up into the .github folder, workflows, demo. Let's see what we have. So we have what Codex should focus on in this repository. Give me a quick tour of the example repo. So it says workflow_dispatch, inputs, focus. Uh-huh.

Okay. I'm not exactly sure. I mean, folks, I made a GitHub Actions course, so I know this stuff. But this is usually, I think, a manual one, right? So here, you're generating a short demo report for GitHub Actions. Look through this repository and produce a concise report. Keep it clear. Export it out as this file. Okay. So before we do that, I'm going to go over to my repository here, wherever it is, up over here, and I'm going to go into my settings, and in here there should be a section for secrets. We will go to Actions, and we will set this at the repo level, and this will be OPENAI_API_KEY, and we will bring in that key. So

I'm looking for where that key is. I still have it. Or maybe I don't anymore, which is fine. I can go make a new key. Making new keys is fun. So go here. I'll just delete this old key. Revoke. We will create ourselves a new key. GitHub key. And we will create that key. Then we will copy it. We will go over to here. We will paste it in. We will hit add secret. Okay. So now we have the secret for this repository, which is the Codex example repository. I don't know if this is really the right approach, but I'm going to do this anyway. So: check out the repository, run Codex

and then generate the report. workflow_dispatch inputs: if I hover over it, it will tell me how it works. You can create workflows, run workflow button. So that's what I was expecting this to be. Okay. So what we'll do is we'll now go ahead and add the workflow, and we will push this. I usually don't do manual workflows. I mean, it's probably more practical to have it react to content and stuff like that, but it's fine. If you took my Claude course, you'll already have an idea of how that stuff works. And so I'm going over to here and we

will give it a refresh. And so it is in here. We will go to actions and I'm looking for where it Oh, here it is. Codex demo. And so we can run this workflow. Oh, cool. You can even change the prompt. So you can ask questions about it. That's kind of cool. And we'll run it and we'll see what results we get back. Okay. All right. So we're back and let's take a look here. And so here we can see three folders suggested next demo to run. Um, and so it generated out. So that's what we wanted to do. I wonder if it actually placed it in the repo because it might have dropped it right in here.

Does it just generate it in place? I mean, I'm fine with that. I don't necessarily need it anywhere else, but it is right here. Let me go down here for a second. It says Codex demo summary. And then, oh, it just prints it out. Okay. So, I'm thinking it didn't commit it to the repository and it's just printing it out here. Cool. And so, we go back over to our top layer here. Yeah. And I mean, that's fine. So, it did exactly what we needed it to do. And there you go. Okay.
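The video does not show which mechanism the generated workflow used to surface the report without committing it. One common option in GitHub Actions is the job summary: each step gets a file path in the GITHUB_STEP_SUMMARY environment variable, and Markdown appended to that file is rendered on the run page. Here is a minimal Python sketch of that idea; publish_report is a hypothetical helper name, not something from the Codex action.

```python
import os


def publish_report(report_md: str, env=None) -> str:
    """Append a Markdown report to the GitHub job summary, or print it.

    publish_report is a hypothetical helper for illustration only.
    GITHUB_STEP_SUMMARY is the path GitHub Actions provides to each step.
    """
    env = os.environ if env is None else env
    summary_path = env.get("GITHUB_STEP_SUMMARY")
    if summary_path:
        # Append, so earlier steps' summaries are preserved.
        with open(summary_path, "a", encoding="utf-8") as fh:
            fh.write(report_md.rstrip("\n") + "\n")
        return "summary"
    # Outside of Actions (or with no summary file), fall back to stdout.
    print(report_md)
    return "stdout"
```

Either way nothing lands in the repository itself; committing the report would need an explicit commit-and-push step in the workflow.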

Agent skills are a lightweight open format for extending AI agents. This is an open format that was created by Anthropic in 2025. Okay. So this is the directory structure that you're going to see. You have the name of the skill right here. It's going to have a SKILL.md. It may have scripts, it may have references, and it may have assets. So skills use progressive disclosure to manage context efficiently. If you ask me what that means, it's a term I forget, and it shows up many times in marketing. So it's some kind of marketing jargon that Anthropic likes to say a

lot. I think progressive disclosure means only load it when you need it, right? So, you know, when you use CLAUDE.md files and we had other files in other locations, it would only load one when it needed to, so the agent doesn't need to know everything up front. We also see this term, progressive disclosure, when we talk about MCP as well, but anyway, that's a side note. So, back to this, there are three things. We've got discovery: at startup, agents load only the name and description of each available skill, just enough to know when it might be relevant. So, what I just said. Activation: when a task matches a skill description, the

agent reads the full SKILL.md instructions into context. Execution: the agent follows the instructions, optionally loading reference files or executing bundled code as needed. So yeah, just what I said. Let's take a look at the anatomy of a SKILL.md. So you have your front matter, and you have to have a name and description. That's all you need to have a skill, but you probably want to have actual instructions because it'll be a lot better. And then the only file you need to have is that SKILL.md. Everything else, the scripts and stuff, is optional. Something that you might also want to specify is the allowed tools for that skill. So here we have a lot more information. You can see

we have license, compatibility, metadata, but allowed tools, I think, is one we're going to care about. Okay. But yeah, that is the most basic information about an agent skill.

Okay, so Codex skills are agent skills. They follow the agent skills open format. This is very similar to Claude, but they're going to be a little bit different, as Codex has some of its own special customization. Skills are just specialized workflows designed to work in very specific ways. As I just said, Codex does extend the standard. Not by much, but there are some differences, and we'll cover that in the next slide. If you want to see a list of skills that

are available to you, you type in /skills and you can see the built-in ones there. Notice that the one in blue is actually one that I built myself. So, with Claude, when you type /skills, it doesn't show you everything. It just shows you one or the other. Can't remember. But the point is that they show you everything here. That's nice. And notice there's a dollar sign, and that's intentional, as skills start with a dollar sign and then the name when you type them into the prompt to invoke them. So if you type in a dollar sign, it will start autocompleting based on available skills. And because you're using Codex skills, or agent skills, if you don't directly say I

want to use this skill, it will just indirectly consider it if the description is a match for that skill. Okay. So over here we have the extension, in terms of what a Codex skill adds. It adds this openai.yaml file in the agents directory. Just make note that the agents directory name is going to vary. I think most of the time it has a period in front of it. We'll find out in actual execution when we go make our own custom skill. And I do believe there was some confusion in terms of the naming, as the docs are not accurate. So in the

lab, we definitely solve that issue, but I believe it's just agents. But anyway, you're going to see that they have some interface stuff in here, and then you can change the invocation policy and the tool dependencies. So that's basically it. Claude does a lot more to extend this format, but this is still fine for Codex. You can save skills and load them from multiple places. Notice that they're going into the agents directory and then skills. Okay, so it can be from your current working directory. It can be above it. It can be in the topmost place of your repo. It can be your home directory, and then there's one that's completely on your

machine. And there's probably another option. So, there are a lot of places where these can get loaded in. I think for the most part, you're going to be using option one or option three here. But yeah, we do have, I think, some challenges with the name of the directory there. It's either with a dot or without a dot, but we'll solve that in the actual custom skill lab, and you'll get that experience hands-on. Okay.

Hey, this is Andrew. In this video, I just want to show you the general skills that we have available to us. And so I'm going to go into maybe our hello world here, and we're going to open up Codex, and let's just take a look at

what possible skills we have. So if we type /skills, we can get a list of the current skills we have. And if I list out skills, we can see them. I actually created a custom skill, which will be in a separate video. If I was in that repo, it would show up there. So if we create our own skills, we should always check /skills to see if they're showing up. But notice we have OpenAI docs, which we probably could have been using sooner since we tend to ask it quite a bit to reference the docs. Here we have deploy model. So I guess that's an integration with Azure and Microsoft Foundry. And yeah, not a whole lot of stuff here

but it is interesting that we have one to generate images for websites, games, or more. So that might be fun to do, and so what I'm going to do here is I'll make a new directory. So, just go back a directory here. And actually, we can go into our Wolfenstein game. Okay. And I'm going to go to Codex here. I'm going to say, can you generate an image that we can use in our Wolfenstein game? Okay. And we might need to do this in two passes. We might have said, what assets do we need, and then used that to generate them out. But we will give that a moment and see what it comes back with. So I'm checking the project. Okay,

so it is doing that. So it says the fastest fallback needs the OPENAI_API_KEY set. So if you want, I can generate a project-ready PNG and save it in this repo, for example, given the current codebase. Yeah, please go ahead. I think if I ever used an image it would fall back to a skill, but we will see if we can generate some stuff out. I can't complete the generation because I don't have it set. So, what I'll do, I'm going to go over to OpenAI, because we do have a key, and sorry, OpenAI developers.

Just trying to make my way over to the platform here. And I'm going to go into API keys. And I'm in basics. I'll create a new one. This will be Wolfenstein graphics, or like 3D images, or image gen. Sure, why not? I'm going to go ahead and create that key. Copy that. Well, I need to close this out first. I'm going to do this. Paste that in like that. Copy my key. Okay. Paste it in as such. Okay. And I'm going to go back over to here. Run it, and we'll go resume, and say, the key should be set in the environment, try again. We'll see if it can pick it up that way. Normally you'd pass it

along with the program when you start, but I'm hoping that it just kind of picks that up. And so technically, we are using the subscription, but it also will use the key. It printed the key out here. So, it knows exactly where the key is, which is really, really interesting. So, we are still on our subscription. I'm going to generate it out. So, it literally looks like it pulled it out of our env vars and is directly using that key. And we'll say, yes, you're allowed to do that. And so now we're just waiting for that generation. And we will wait a moment.
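Since the agent echoed the key back in the session, it is worth seeing how a script would typically pick a key up from the environment and mask it before logging. This is a small Python sketch under stated assumptions: read_api_key and mask_secret are hypothetical helper names, and OPENAI_API_KEY is the variable name used for the secret earlier in this course. It is not how the image generation skill itself is implemented.

```python
import os


def mask_secret(value: str, visible: int = 4) -> str:
    """Hide all but the last few characters of a secret for safe logging."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def read_api_key(env=None, name: str = "OPENAI_API_KEY") -> str:
    """Read an API key from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set in the environment")
    return key
```

A child process inherits the environment of the shell that launched it, which is one plausible reason the key was visible to the agent after it was exported in the same terminal.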

It's saying the bundled CLI is failing against OpenAI. I'm checking the expected API service. So, it looks like I have an old version of the OpenAI package installed. I would have thought that it would use HTTP requests directly, but it's interesting that it's invoking it that way. So I'm not really sure what's going on in the image gen skill. Maybe they're just helping you with the image prompts, because it seems like it's still invoking the endpoints manually and using SDKs. Okay. And now we're just waiting for that generation. And we can see it running in the background there. This might be our perfect opportunity to type in PS. So, PS allows us to see background tasks.

It's like the only opportunity I ever got to show that. Um, yeah, PS is for background tasks, right? And even still, I can go like PS right now. And we can see this is a background terminal that's running. I'll probably put in the message PS so that we know about it because I'm never going to have another opportunity to show uh processes. And it says it's now written the images. And now it's doing a sanity check. And we will hang tight here. And now it's showing those final prompts. I've I've upgraded the local OpenAI package for so the bundler can do whatever. That's cool. I can generate a matching wall text and enemy now.

Um, did it do it? So I don't think it generated it out. Let's go back over to here. Source disc assets. Yeah. So you know proceed to generate images and maybe it'll be smart enough to cue those up as separate tasks or processes. We'll see. And we'll see if it does that. If it was smart would do that. So, I'm generating a small starter set of usable assets. I'll batch them into the output img genen. So, you have a coherent set uh instead of one-off art. Sounds good to me.

I was not expecting Wolfenstein to be this easy to make to be honest. We'll go ahead and say yes and proceed. Is passing the prompting information uh or the prompt for the image generation. So, that is going. We'll give it a moment here. Okay, the images have generated out. Let's go take a look here. And I'm going to go into where did it go? It says image gen output. So still going generated set. Take a look here. Oh, it's something portrait icon wall stone. Wow.

Well, maybe not that one. That one's not very good. The question is going to incorporate them though. It is limited in terms of its generation, right? Because it's very fixed. But we will wait here a little longer. And so now, let's see. The strongest immediate use assets are the wall textures. That's true. Let's take a look at what else we have. Yeah, these are fine. I mean, the guard photos aren't very good. I also upgrade the local Python. We know that. So, if you want the next steps, integrate them into the game. Generate the matching um

generate matching door floor. I would say, you know, focus on to generate uh we don't need enemy variants, but we need a floor. I don't think the game has doors yet. So, we'll do that. And so, it should already know to utilize the image gen or have enough uh context information to do that. Uh, and then we'll integrate them and then we'll see the results. So, I'm generating the matching floor and door art rather than enemies. Okay, great. So, I didn't think there was door logic, but if it's there, let's have a go at it. So, it's explaining what it is. Again, this is not great. Again, I'm assuming this we're going to see that as the enemy, and if it's not great, we

will adjust it. Our usage has been pretty good so far. Uh, but we'll hang tight here. Let's take a look here. So, the game has doors already. And so, now it's generated out those images. And so, we have this door, that door. I really don't see that working very well, but I mean like the floor and stuff will work, right? So, okay. Um, so now let's uh let's implement actually in the game. So, they show up. Okay. Yeah. I mean, it's not going to know. It's not going to be the best at this, but like how's that going to work? And how's that going to work? Yeah, I don't know about that. How's that going to work? Right.

It's still cool that we can quickly do this and get a prototype, but we will have to hang tight, right? Okay. So, it is now done and it will be fun to go give this game a try. So, I'm going to go over just to another directory. Um, and we'll go to Wolfenstein and we will run our build. Not sure if that's how we ran it before. I don't remember as it's been a while. So, going to go over to here, take a look.

Do we have any instructions on how to serve it? Probably not. I'm going to do http-server. So, that'll start on localhost 8080. We'll open the browser. Maybe it just runs here. I have this. Enter the bunker. We'll go to here. Yeah, I'm not exactly sure. So, go back. Okay. Can you run the game? Because http-server should have worked. So I can test. I don't know, the images are large. It shouldn't matter. npm run dev. So we'll go ahead and do that. It's just an

alternative way of running it. And so we'll open this up. There we go. I'm not exactly sure why the other one didn't work, but our graphics are in. How cool is that? So, there's some things that are missing like whatever that is down below. Oh, it's our gun. Right. So, it says shots fired. Whoa. We got some stretching going on here. Why is this wall green? And they don't seem to die or vanish, I should say. There we go. I wasn't clicking uh shoot. There we go. And I

don't I don't It doesn't show any health here. Oh my goodness. It's weird that when we roll into it, it does that weird stretching part. There we go. Why do we have so many doors here? So, we're getting definitely more distortion and some weird stuff. But the point is that it still is kind of working. So, again, this isn't a go make an entire course on um on Wolfenstein. Why is there so many doors? What weird logic is this? But it's interesting that it works.

Okay. So, yeah, we have achieved our example working with skills. We'll call that done. Okay.

Hey, this is Andrew. In this video, we're going to take a look at skills. And so what we're going to do is make a new folder. And this will be something that we need to create a skill for. I like using data. So, we'll just say like simple to-do data, and we'll go into here. I'm going to launch Codex, and what I want to do is create maybe a JSON structure file. So, create me a JSON file to manage my project management. I need

uh mock data for a single JSON file. Okay, I'm going to put this into plan mode just so that it'll ask me some questions. And the idea here is that we'll generate it out and then we'll create a skill that can go and manage uh that list there. Okay. And apparently do they do have a built-in creator uh that we can utilize to help build our skill. Um but we'll just give it a moment here to get established and then we will move forward here. Okay. Oh, here are questions. So, uh one JSON file with projects tasks. Yes. Um, shape the data like realistic app. No, shape the data more like normalized C data.

Um, I would just maybe more like C data. Sure. You're going to have different options than me. We're not going to have the same thing. Great. It's looking good so far. And we'll say implement switch to default mode. And so we just want that one uh file created. Okay. All right. We are back. Let's take a look at our actual data here and see what we have. So give us a nice refresh here. And so we have uh data, right? I'm not that worried about the data. I just want to build a skill. And I find skills are really useful if you know manipulating something like data. That's

what I like. Let's go ahead and we'll type in skill. And notice I had to put a dollar sign in here to get all the stuff. I didn't realize that the dollar sign was our skill activator, but we have a bunch here. So, we'll go ahead and type in skill creator. Okay. And we'll say, we need to create a skill that will let us add, modify, find, or delete tasks in our project manager data file. Okay. So we'll go ahead and enter that in. And so it should create that skills directory and that SKILL.md. And then we will take a look at what we have. Okay. So here it's starting to use the skill creator, and it's checking for its helper scripts so it can scaffold it. So

that's good. Where should I create this? Um no this is a project specific skill. So create it in the current folder. Okay. Otherwise if it puts it there that's going to be a globally accessible skill. That's not what we want, right? So we'll give it a moment there to figure that out. or it's been taking quite a long time to write the skill, but let's go take a look and see what it has currently. And so in here we have project task JSON manager. Um, shouldn't it be in a skills folder?
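To see why the folder location matters, here is a rough Python sketch of the discovery step described on the slides: scan a skills directory and read only the name and description from each SKILL.md front matter. The function names are hypothetical, the front matter parser only handles simple key: value lines, and the default directory name (.agents/skills) is an assumption, since the course itself notes the dot-versus-no-dot naming is unclear. It is not Codex's actual loader.

```python
from pathlib import Path


def read_front_matter(skill_md: Path) -> dict:
    """Return simple key: value pairs between the leading '---' fences."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    meta = {}
    if not lines or lines[0].strip() != "---":
        return meta
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def discover_skills(root: Path, skills_dir: str = ".agents/skills") -> dict:
    """Discovery: collect only name and description for each skill folder.

    skills_dir is an assumed location; adjust it to whatever your tool expects.
    """
    found = {}
    base = root / skills_dir
    if not base.is_dir():
        return found
    for skill_md in sorted(base.glob("*/SKILL.md")):
        meta = read_front_matter(skill_md)
        if "name" in meta and "description" in meta:
            found[meta["name"]] = meta["description"]
    return found
```

With a loader shaped like this, a skill folder sitting at the repo root is simply never scanned, which matches the behavior in the lab: the skill only appears in the list once it is moved under the skills directory.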

Mhm. Just a moment here. Well, how's it going to detect that? Let me just double check. All right. And so here it's saying that, well, obviously there are multiple places where you can place it. There's agents/skills, but there's also just skills in the Codex directory. I thought that's what it would do. But here it says admin. So, any skill checked into the machine or container in the system, used for SDK scripts, automation, and for checking in default admin skills available to each user on the machine. Sure. I mean, I'm in the repo, right? Oh, but this one is at /etc/codex/skills. And so here it's expecting agents/skills. Very inconsistent folder

naming scheme. And so here it's validating it and it's running it manually, which is silly. Let's go ahead and type in skills. I want to take a look here and list the skills. And so what I'm looking for is my project management one. I don't see it here. And maybe we'd have to restart the context. But I'm going to go ahead here and make a new folder called, well, hold on. We have agents right here. And I'll go here and just say, skill creator, you created me a skill, but the docs say you are supposed to, oh my goodness. We'll just go ahead and do it ourselves. Let's just test it. This is ridiculous that it did not create the folder correctly. And

so we will go into here and we'll make a new folder. And this will be agents. And then we'll go ahead and say skills. And then we'll drag this into here and we'll move it. Totally ridiculous that we have to do it that way. But we'll go ahead and run this, and we'll type in skills and we'll say list skills. And there, now it shows up, right? So it's really interesting that the Codex skill creator does not work correctly, but that's okay. So now that we have it, I want to go invoke the skill. And the way skills are invoked is based on the description. So here it says: manage the task records. Use Codex to add, update, find, filter tasks. Use a skill to make

deterministic changes. And then here we have helper runner scripts. I'm not sure why we need all that. Sure. This explains the task schema, so that's not a bad idea for references. And then we have manage tasks. So here it knows the structure, and so it's utilizing this to parse, return, and look at information. Sure. And then we go in here into the OpenAI file. Uh-huh. Sure. Okay. I guess that's to expose it so we can utilize it over here. And we'll go over to the top here. So, let's go ahead and invoke it. So we'll say project

task manager. Can I get a list of my tasks? Not that we would need that to do it, but we'll go ahead and just do that there. I actually needed a um skill for like I have uh I'm learning Japanese and I have these transcripts and I need to adjust them so they perfectly align up and sometimes the speaker is wrong, the place is wrong and stuff like that. It needs complex logic to figure it out and so I have a skill for that but this is um obviously simpler than this. So here we'll uh wait for it to come back not the helper script. So I'm reading the task records directly.

didn't even use the uh thing. I've got the user map next for uh next key is formatting it. Mhm. And so now we are getting a tasks back. It is working. So we can't complain. It does work. And I'm going to go ahead and do this. And so we'll say we have a bunch of tasks here. and um as done. And what I'm trying to observe here is it actually using the operations within the skill or is it um just going out here at the very least we know that it is using a prompt and it is being contextualized based on this information. The only thing is that like these locations are very um explicit as

to where they are, and maybe it doesn't know where it is relative to here. And mostly it's just using jq. So it's not using manage_tasks.py, and then it's editing the file directly. So that's not working exactly. We wanted it to use manage_tasks.py. So I'm going to go back here and we're going to say, skill creator, the skill you created is not using manage_tasks.py. It just seems to be reading and updating directly. Can we force the skill to always use manage_tasks.py so that it ensures data consistency?
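For a sense of what routing every change through a helper script buys you, here is a minimal Python sketch of a task helper in the spirit of the generated script. The actual script is never shown in the video, so everything here is an assumption: the function names, and the data shape (a top-level "tasks" list whose items have "id", "priority", and "status" fields) are hypothetical.

```python
import json
from pathlib import Path


def _load(path: Path) -> dict:
    """Read the whole data file; assumes a top-level 'tasks' list."""
    return json.loads(path.read_text(encoding="utf-8"))


def find_tasks(path: Path, **filters) -> list:
    """Return tasks whose fields equal every given filter value."""
    tasks = _load(path)["tasks"]
    return [t for t in tasks if all(t.get(k) == v for k, v in filters.items())]


def update_task(path: Path, task_id: str, **changes) -> dict:
    """Apply field changes to one task and rewrite the file consistently."""
    data = _load(path)
    for task in data["tasks"]:
        if task.get("id") == task_id:
            task.update(changes)
            # Always serialize the same way so diffs stay predictable.
            path.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
            return task
    raise KeyError(f"no task with id {task_id!r}")
```

The point of the design is that the agent calls one small, predictable entry point instead of improvising jq commands or hand-editing JSON, so every read and write goes through the same validation and formatting.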

Okay, because right here I'm looking for it and it's not executing it right. So we'll go ahead here and hit enter. I mean there's one skill. I didn't specify which skill, but we'll see if it can pick that up. Yep. Um, so here we have information updates. Let's take a look here and see what it did. So I'm adding the skill now that changes it fixes the command path to a non-negotiable never edit JSON directly. Um, so use the skill to determine whatever. Um, in here we are trying to find add find. So it's giving better examples. Resolve the helper path. So,

it's updated the pathing and it has updated examples, and it just keeps reiterating: hey, do not manually edit this stuff. So, we'll go ahead and we'll say yes, proceed. Okay, the only thing we don't have here is like a list, a task list. So, I'm just going here and, I mean, we should be able to filter and find. So, find matching tasks, right? Okay, so the validation passed, but the smoke test exposed the actual blocker: manage_tasks.py computes the default data path incorrectly. And so we'll go ahead and let it update that. And now it's back. So it says it changed

the skill: it uses the correct helper path, always uses the script, fixed the typo in there, and then we have a quick validate. And so now we should be able to edit things. And so I'm just going to kill it and go back and do it. I don't know if that really needs to happen, but I'm going to do it anyway. We'll do dollar sign. And here we'll do project task manager. And we'll say, show me only, or find me, the medium level tasks, right? And so here I wanted it to invoke this script that it created. And look, now it's using it, right? Okay. And it comes back and formats it.

Nice. There we go. Okay. And this is better because then it's reducing the computational need of the agent, but also we're also restricting how it should interact with it. And they'll go say um you know, can we change task uh to be done? Okay. And here uh you know, we didn't specifically say it, but we're hoping that invokes the task. And it is. It looks like it is anyway. And so now it's updated that task. Okay, relist out the recent uh found tasks. And again, I'm going to see if it invokes it. I think it's invoking the task. It's just hard to tell because here we explicitly used it, right? But I believe that it is routing and it is using it

correctly. And there you go. So that's skills, and agent skills is a standardized format. So technically you could take this and move it over to Claude Code. The only thing that's a little bit unusual here is the agents OpenAI file. I'm not sure why we need this. But I'm sure when I go update the slides, we'll explain that. Okay.

Hey folks, this is Andrew, and in this video I want to show you how to change your model and how to change your effort.

Very straightforward. We'll go ahead and type in codex. And then if we type in model, we can now choose our model. So I'll go to mini. And then we have other options: low, medium, high, extra high. So the more effort you give it, the more reasoning the model does before it responds. I'm going to go to low. And we'll go ahead and just say, can you add another puts to the hello world file? Okay. And the idea is that this is going to save on your computation down below. If you're doing really simple things, you want to keep the effort low. You want to use a model that is appropriate. And there you go. That was super fast. I barely noticed any kind of change in usage there.

And so I'm just going to quit out of that. You can see the token usage is very small. But yeah, not much to say there. Very straightforward. I'll see you in the next one. Okay. Ciao ciao.

Hey, this is Andrew. Let's take a look at whether we can use the fast feature. So, I'm going to go ahead here and type in fast. So, Fast mode toggles the fastest inference at two times plan usage. And so the idea is that when you use this, it will be two times faster than it normally would be. So, I'm in my Hello World app right now. That's what we ran this on. I'm going to go ahead and say, can you update my hello world so that it has

20 strings in an array and loops and prints them out. So, we'll go ahead and hit enter, and we're going to get some two-times speed. It's interesting, my usage just reset there. It was not at 100% just a moment ago. And oh, is it already done? So that was pretty darn fast. I'm not sure what model size I'm on right now. I might be on mini. I'm on mini, and I'm on the lowest reasoning. And so as you can see, that is super fast. And it did it for both. So, you know, if you're in a hurry, you can use fast mode.

I'm going to toggle that off. But that was fun. Okay. Ciao ciao.

Hey, this is Andrew. In this video, we are going to take a look at plugins. So, plugins allow us to bring in a bunch of stuff as a package. We'll go ahead and type in plugin. And so here we can see the curated plugins available. We have Linear, Netflix, Notion, Sentry, Slack, Stripe, all sorts of stuff. We have Game Studio, which is just a series of plugins. I want to go ahead and take a look at what Game Studio is. So, I'm just off screen here, waiting for my browser to be responsive. I'm going to type in Game Studio plugins. Again, I'm just typing this into Google, with

Codex. And so what I'm looking for is whether it will tell me what's in this plugin, right? Because it should be on GitHub somewhere. I would expect it to be. So we go back here. Let's type in GitHub. It must be on GitHub, because I believe it's one that Dave authored, right? It says Roblox video, which actually is not a bad idea. My kid really likes Roblox, and that'd be something that would probably be useful to give a go. Popular project for connecting AI agents to Roblox. Oh, but to Roblox Studio, right? So I'm thinking about it, but I was hoping what I would find are the actual plugins. So, let's just say Codex plugin marketplace

because that's what I want, right? I want to be able to see where they come from, and they're not making it easy. Give me just a moment. Okay. All right. So, maybe this is my confusion. They're saying that there is no general repository for it, which is different from Claude. So, that's why I'm a little bit confused. But, you know, the issue is, how am I using something that I can't see the use of? So, what I'm going to do is make a new tab here. We're going to go explore our Codex directory in our home directory. Okay? Because this stuff has to go somewhere, right? And so I'm going to go into skills.

We have develop. See, it's right there. I'm going to go ahead and open it up and we'll take a look at what we actually have here. So we have agents, our agents OpenAI file, and we learned about that in skills. So here this is defining the interface, and there are just a couple of graphics. This is more for the Codex app, obviously. So this is just a skill, that's all it really is. We'll go into here. We have a Playwright client. Okay. And so Playwright is a way to control the browser. So this is a way that it could directly control it. Let's go into the skill.

I was expecting a bunch of skills. It's just a single skill. Okay. So: use when Codex is building or iterating on a web game and needs a reliable development and testing loop. Install small changes. Run a Playwright test script. Build the game in incremental steps. Okay. So nothing super complicated here. I was expecting more. I thought it had more skills, because if we go over here, and this might be a video for later or from before, but if we go into our plugins here, and we go into skills, and we type in game, Game Studio. I was expecting a little bit more over here. So here we have play test studio frontend game fibergame. We have a bunch of skills, a ton of skills. So where are

they? Because that's what I want to know. And we have one here. Develop web game. And that's just the one. So let's go back over to here. We'll cd back. Maybe they're just here in the directory. No, it's just the one. I mean, I didn't install the plugin. I installed the skill. Well, I thought it No, no, no. It's right here. So, let's go ahead and hit enter, I guess. We'll go ahead and hit enter. So, this is going to install the plugin and all skills.

Okay, so now it's installed. I'm going to go back over to here and I'm expecting more, right? Let's go system. No, no plugins. Uh-huh. Let's find that content. There we go. You can't hide from me. Because, you know, you've got to get to this stuff if you want to work with it, and if you know where it is, you can muck with it. But let's open it up and take a look at what we have here. So there we go. There are all our skills. And so there's nothing really that complicated with it.
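
If you would rather not click through folders to find where installed skills ended up, a small script can do the hunting. This is only a sketch, and it assumes each skill is a folder containing a `SKILL.md` file (the Agent Skills convention) and that everything lives somewhere under `~/.codex`, the home-directory location explored above. The function name is hypothetical.

```python
from pathlib import Path


def find_skills(root):
    """Return sorted directories under root that contain a SKILL.md file."""
    root = Path(root).expanduser()
    if not root.is_dir():
        return []
    return sorted(path.parent for path in root.rglob("SKILL.md"))


if __name__ == "__main__":
    # Assumption: Codex keeps its skills and plugin content under ~/.codex.
    for skill_dir in find_skills("~/.codex"):
        print(skill_dir)
```

Running it before and after installing a plugin is a quick way to see which skill folders the plugin actually added.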

We have a few more steps here, and then within them they might have their own. So we had Game Studio somewhere in here. What's interesting is that the other one had Playwright, which was an individual skill. This one doesn't seem to have it. Okay. And so I just want to, you know, point out that it's nice that there are these skills, but you really should build your own. Because, you know, I know how to build Phaser 2D games, and I would not describe this as particularly good. I'd be a lot more specific in terms of how I'd want it to work. And so, you know, just because there are these things here, they took two seconds to generate. Um, and so

you'll get better results if you make your own. But anyway, that's where the plugins are. And I suppose we could look at how to make our own plugin. I'm not that interested in making our own plugin, but I just wanted to show you that those options are there. I'm just curious about Linear, because it does come up as a main integration. So I'm just wondering if we should explore that. It is this thing here: product development system for teams. Do we care about this? Planning and building projects, designed for the AI era. So, it's a design project management tool. I don't know.

I'm not sure if I'm that interested in it, but it does come up as an integration thing. So, build, plan, deploy. It kind of feels like just a way to control and watch the project outcome. So, no, I don't think so. I think that we got what we wanted here, but obviously there are other ones. I'm just curious. So, let's go ahead and look at maybe Gmail. We'll say install that plugin and we'll go back over to... Oh, make sure we grab all of it. Install on ChatGPT. No, no, no. Continue without waiting for refresh. Install the required apps in ChatGPT to continue. Ah, so you have to install it over there.

Okay. I don't really want to do that, though. So, I'm not exactly sure why we'd have to go over to ChatGPT for that. I was just hoping that it was a skill in here. But maybe that's the way it has a connection out to that. But again, this is more of a developer-focused course, so we're not going to do that here. That would be something I would put in my from-zero course where we do automations. But yeah, again, we have an idea of how plugins work. So that's good enough. Okay.

Okay, this is kind of interesting, but if you want Codex to talk to the Agents SDK, one strategy is to launch Codex as its own MCP server. And so you would

launch it up, and then you could use the Agents SDK to communicate with it as an MCP server. To me, I find this completely convoluted and complicated. Again, Claude just has the Claude SDK, but for whatever reason, this is where they wanted to draw those boundaries. I cannot tell you why, but that's how it is. So, there you go. Okay.

In this video, I'd like us to work with an MCP tool. I'm probably going to use Roblox, as that's probably a good method for it. So, I'm going to go ahead and type in codex. And in here, what I'm going to do is go into the

plugins, because I think we saw one for Roblox. I'm going to type in Roblox. Nope. So what I'm going to do instead is just go to the internet and type in Roblox. We will get Roblox if we want it. Roblox MCP Codex. And so we have one here. So here's a fellow that made it. And we're going to go ahead and add it. I like how it's available for all three: we have Claude, Codex, and Gemini. And so I'm going to go ahead and copy this, and we will stop, and we will hit enter. And so now Codex has been added, or sorry, Roblox has been added as an MCP server. And so for this to work,

I'm going to have to get Roblox Studio installed. So you can just watch and learn or you can install it along with me. Maybe you hate Roblox, maybe you think it's interesting. There's good money to be made on Roblox. Um, and so I'm going to go ahead and just make sure that I have the studio installed. I actually might already have it installed because I was trying to show my kid how to code. Um, and they actually do have a built-in AI, but it's not the best. I can't show you, but I'm actually loading um studio offscreen here. It's just over here. And so now we're over here. I need to get logged in. So just give me a moment to log in. And I'm now in the studio. I'm going to just say back to

home. I'm going to go ahead and make myself a new place. And so, if you've never used Roblox before, it could be pretty confusing, but it's not really hard. You're basically pressing the plus button here on the right-hand side quite a bit to add things. And there's this chat down below. And literally, I learned it in like an hour to show my kid. So I'm going to go back over to here and I'm going to go ahead and say codex. Well, I don't think it really matters where we do this, but I'm going to make a new directory. I'm going to call it, like, Roblox fun.

Okay. And we're going to go ahead and type in codex. And so I would suspect that if we have MCP, we just type in MCP and it would tell us if it's connected or not. So here we have it. We have codex apps: bearer token; tools, none; resources, none; unsupported. That's not great. But here we have a bunch of tools like add tag, capture screenshot, create build, create object. So we should be able to say something like add something to the world. So, you know, I have Roblox open right now.

Can you... Roblox Studio, can you use the MCP tool to add an object to the world? Okay. And let's go ahead and see if that works. Normally, like with Claude, you check to see if you're actually connected or not. And here it's not so clear. I can do that with the studio. I need one detail: what object should I add and where? Add a square. Add a... let's go take a look at what we can add, because they do have different kinds of objects. So we have parts. So we'll say a sphere. So add a sphere, you know, like 10 out from origin. Okay, I'm just saying 10. I would assume that's something. So here, allow it to do this. We'll say allow. Yes.

I've never done this before. Just trying it out. And so here it's doing Roblox create object. I'm assuming it's making API calls on our machine somehow. No idea exactly how it's working. We could probably look at the code. Could we look at the code? Actually, I'm not sure, but we're going to wait here and see if it works. If it works, that'd be really cool. In our other video, I used Trivago, so we're not going to do Trivago this time. So, allow it to run get instance. We'll say allow. So it's a studio plugin connection timeout. Okay. So maybe we need a plugin.
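
For a rough idea of what a tool call like this looks like on the wire: MCP clients send JSON-RPC 2.0 messages, and invoking a tool uses the `tools/call` method with the tool's name and its arguments. The sketch below only builds such a message. The tool name `create_object` and the argument keys are hypothetical, since the video does not show this server's actual tool schema.

```python
import json
from itertools import count

# Simple increasing request ids, as JSON-RPC requires a unique id per request.
_request_ids = count(1)


def build_tool_call(tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    # Hypothetical tool name and argument keys, for illustration only.
    request = build_tool_call(
        "create_object",
        {"className": "Part", "shape": "Ball", "position": [10, 0, 0]},
    )
    print(json.dumps(request, indent=2))
```

The agent decides the tool name and arguments; the approval prompt you see in Codex is asking whether it may send a request of this shape to the server.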

Let's go over to here. I didn't install any plugin. So uh there's no plugin tab here, but let's go back over to the Roblox um uh actual website here. I've kind of lost it now. I'm just looking for it. I don't see it. So, type Roblox MCP here. And so, there must be a plugin. So, install it. Connect the AI assistants like Claude and Gemini to the studio. Oh, install the studio plugin over here.

Okay, so I missed the first step and we need to download that first otherwise it's not going to work. Um, so we'll click here and so there it is. I'll go ahead and download that. Totally safe to install, right? And uh the file is downloaded. So I need to load it in. So I'm going to assume there's some way to bring it in. Let's see if I can just drag it in. Does that work? Uh, I mean, it's in here. I'm not sure if that's the correct way to bring it in. Plugins. We'll say manage plugins. Add.

Add. And so somewhere in here, we should be able to bring them in. So, we have creator store. No, I just You haven't installed any plugins. To find more plugins, open the marketplace. No, that's not how we want to do it. We just want to bring it in plugins folder. There we go. And so here's a folder right here. I'm going to drop this plugin like that. And so now we'll go back over to here and I'm going to close out the studio. Don't save. And we will reopen the studio here. And we'll see we can get this to work here. So, uh-huh. We'll go ahead here

and open a base template, base plate template. And now we have MCP server. Okay. And so there's one thing we do need to do. We have to allow Well, this will see if we're actually connected. Connect. But I think we have to allow um HTTP server. It said I keep losing um the window for the MCP server. Here we go. Back over to here. We'll click back. Back. So, enable allow setting security. So, we're going to go to file studio settings. We got network allow it in experience settings. Oh, okay. So, an experience is also like what we're playing. So, somewhere in

here we will have uh settings. So, I'm looking for that option. No, that's not helpful. I can just see if I can find the settings this way. There it is. We'll just add our own button. Then we don't have to figure it out. And so save to experience settings. Save to Roblox. Oh, this is when we like publish it or make it. So we're going to say go. And I'm looking for that network setting. So just a moment here.

I don't see anything in here that would say update existing experience. Maybe we need to save our game. So, we'll go ahead and save this first. Save file as. Just move this off to my desktop here. And I'll call this place one on my desktop. And then I'll go here. Maybe this experience settings was here this entire time. Oh, it's right here. Save to access. I thought I saved it. Oh, save it to actually Roblox. So, I'm going to go ahead and save it. This like actually publishes it, which is so darn silly. But that's how you can only use certain features, strange, strangely enough. And so now we have more options. So I go to security, allow HTTP requests.

Yeah, it can open you up to third-party attacks, that's for sure. We're going to go back over to our plugins. Click our button here. We'll go ahead and make sure we're connected. So now we're green across the board. Again, I've never used this before, but it's not like it's that hard, right? And okay, I'm going to go try again: I installed the plugin and gave it access. Okay. And so we're going to go ahead and try this again. We'll say allow session.

Okay. And so we'll go back over to here. So, what I'm looking for is, did it place that object anywhere? Okay. Because if it did, it'd be in the workspace. Oh, it's right there. Where is it? Why don't I see it? I think if I hit F, I can jump to it. There it is. Oh, it's right in the center. Okay. It didn't set the position. So, it did insert a sphere. Well, I actually don't know. I'm going to go delete this. I don't trust it. Let's go try this again. So, I'm going to delete that sphere.

Okay. Uh, okay. I deleted it. Insert it again. Make it very large. And make it green. Okay. It says 1010. Maybe I don't know. It still didn't do what I thought it would do. So, run it again. And so, we'll go back over to here. Oh, there's definitely a sphere if we scroll on down. Um, so definitely is inserting a sphere. At least I think it is. We'll go back over to here. Is there any other parts it can do? How about a wedge? Okay, insert a wedge. um default position.

Okay, we'll say allow. And the wedge has now been inserted. The only thing I'm noticing is like it's not changing the color or anything like that. So like the sphere color and location didn't change when you first inserted it. Can you try updating the sphere? We're also we're on mini low um by the way, but it seems to be working pretty well on mini low. And so here it's going to try to set some properties. Okay. And it might be just the intelligence is really low and I haven't been paying attention. But we'll go back to here.

Press F to go there. Nope. So it just can't seem to update it. So it does work... Oops. Allow. Oh, no. There we go. It updated it. So, there you go. That's an MCP server. And we can use MCP servers for all sorts of things. You have different challenges with different services. But yeah, I thought that was kind of cool. There you go.

Hey, this is Andrew. In this video, we are going to take a look at sub-agents. So, what I'm going to do is just go back a directory, and we're going to make a new directory. I'm just

trying to think of something that would be useful to build for this. Do we already have a to-do app? I feel like that might be something we already have. Not necessarily. So, go ahead and say to-do app. And I'm going to go ahead and launch Codex. And I'm going to say, I want you to build a to-do app, and I want you to use multiple sub-agents. So basically, when you define what you want, it will go ahead and create those sub-agents. I think that we can define sub-agents in here. So, can we define multiple possible sub-agents?

I'm going to put this into plan mode. Okay. And by the way, I'm still on mini, so that's probably not a great idea uh to be on mini here. I'm just going to stop that for a second. And we'll go back over to here. We'll say model, and we'll go back to default. And we'll go back to medium. Oops. I want to be sure I'm on plan mode. I'm on plan mode now. And we'll go ahead and run that. And so plan mode is going to figure out what kind of stuff that we need. Be back in just a moment. Okay. And so what I'm hoping that it will do, it's it's working here. But what I want it to do is create some agents for us. Uh and so here you can see there's like an explorer, reviewer, a researcher. And so

that is what we are trying to do. And so I'm going to say no. It doesn't understand. No. Oh, sure. Yeah. Okay. So, we'll say, you know, vanilla HTML, CRUD first. And here it's talking about the sub-agents: define several candidate agents, activate only when one is needed. Split implementation aggressively across multiple workers at the same time. Maximal parallelism. And so what I want is for it to create the agents directory and then make a bunch of TOML files for our different agents. And so here we go. And so what I'm hoping that it's going

to do is go ahead and create them in the to-do app. Where is it? My to-do app here. We have a to-do app folder, right? Simple to-do app. Here we go. Agents. And here it says project task manager. Oh, no, no, no. That's the data. I was going to say, that looked very familiar. So, I mean, we could go ahead and say yes, but I'm not sure if it's going to do this correctly. And the other question is, where am I right now? I can't even tell. So I'm going to back out here and type ls -la. I just want to know that we did create that. I'm pretty sure we created that directory though, right?

Well, it's definitely not here. So I must be somewhere else. But as long as it makes it, that's all I care about. And now it's spawning those agents. I can't even see them right now. And so if we go to agent here, we can see that it created multiple workers. Okay. But the only thing I can't tell is like where is this directory? I'm going to stop this. I don't know where I am. Oh, I'm one folder into my Roblox folder over here. So we go here to app. I'm bringing this up directory because what I wanted to see while I was doing this, I wanted to see did it actually create any of that stuff.

cd back, and then we'll go back to our to-do app here and we'll type in codex. Okay, so it's fine if we quit out of this. This is smarter to learn. So we'll go ahead back into plan mode. I want to create a team of agents for building a to-do app. Can you create me multiple agents in the... come on here. It's not letting me copy here, but we'll say the Codex agents directory. So, I'm not telling it to build the to-do app. I'm just telling it to focus on building those agents. But you saw right away when we did that,

it started executing them, but it's not pulling from existing ones that we had there. I think it was literally spinning up other ones. Oh, maybe this was here before. This isn't supposed to be here. I think it's just that when we dragged it over, it didn't show up, and now it is. And it's looking around for too much stuff. It really should just be focusing on the to-do app. This is one issue, even with Claude: they kind of get off track and they start looking all over the place.

Here we'll say React this time. Specialized roles, but I don't want it to create the app. I just want it to create the agents. Okay, that's not what the doc shows. It shows TOML files. So I'm just trying to click. I don't know what happens. Sometimes my browser freezes here. So I'm going to go ahead and say no, stay in plan. They are TOML files. This is for Codex. It's already thinking about making markdown files, which is fine, but that shouldn't work, right? If it's TOML files that it wants,

then TOML files it should get. So we'll just give it a moment here. It doesn't know about itself. Isn't that funny? There technically is a skill that we can tell it to go look at for this. So: create project-scoped Codex sub-agents under that location using standalone files, and define what each of them does. Yes. So it looks like it's just going to implement the agents. So we'll go ahead and do that. And so now we're setting up explicit agents. We'll give it a moment.

Okay. And so we have a bunch here. We'll go ahead and say yes. And we're not executing any plan right now. We are just... the patch failed because it does not exist. Of course it doesn't. You've got to make it. And notice there's a codex file. So that kind of gets in the way. Mhm. Yeah. See it? Yep. They've got to move it. There's nothing in that file. It's just a placeholder file. Terrible design. Again, I don't know why they did that. It shouldn't take this long to make them. I could be making them by hand right now, but we will let it fiddle around here. Oh, here we go. Yep.

Just delete it. I'll just make the codex folder. I'll just make the agents folder. Okay. We'll just say yes. I mean, I've already deleted it, so this will be a non-issue, but I'm hoping it figures out that I made it for it. Come on. So, this is frustrating, right? Almost to the point where I just kind of want to copy-paste this stuff in. I'm just waiting for it to come back. I don't want to interrupt it because I don't want it to muck this up. I'm going to wait for it to come back here to me. Here we go. Now it's back. Okay. So, we'll go ahead and say yes. I think it knows now that it can make the files.

This is resolved. Yeah, I resolved it, buddy. So, we're just waiting for those to come here. There we go. So, now we have our agents. We have our config file, with the max number of threads and the max depth. So, how many can work at one time, and to what depth. I just want to check the depth value here, as my browser takes time to wake up. I'm not sure why. So I want to search depth. So we have the concurrent open agent thread cap, and spawned agent nesting depth, which allows a direct child agent to

spawn but prevents deeper nesting. Keep the default unless you specifically need recursive delegation. So that sounds like we could have a lot of recursiveness there, and so sticking to one would probably be a good idea. One repo quirk came up. Yeah, that's fine. Okay. We have multiple agents and I want them to create a to-do app. Use JavaScript with the Mithril JavaScript framework. Okay. Keep the app minimal, for a single company. Okay. And so we'll go ahead, and I'm hoping that it understands that we're defining those agents there. But

we're right now... Oh, did I? Yeah, I'm in plan mode. It's making a plan. It's going to look at what agents we have available. And then what we're hoping to see is them all execute and work together. So I'll just give it a moment here to figure that out. Okay. So it's back here and it has questions for me. Shared company board: one shared team, optimized for internal coordination. Sure. Local storage, sure. Yeah, internal tools. So we're basically building a bespoke application, folks.
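
The thread cap and nesting depth settings from the config file can be pictured with a small toy model. This is not how Codex implements sub-agents; it is a sketch in which a main thread fans tasks out to child workers, a pool size stands in for the thread cap, and a depth check stands in for the nesting limit. The constant values and all names here are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 2  # hypothetical cap on agents running at the same time
MAX_DEPTH = 1    # direct children may be spawned; grandchildren may not


def run_agent(name, task, depth=0):
    """Pretend to run an agent; refuse to run past the nesting limit."""
    if depth > MAX_DEPTH:
        raise RuntimeError(f"{name}: nesting depth {depth} exceeds {MAX_DEPTH}")
    return f"{name} finished: {task}"


def coordinate(assignments):
    """Fan tasks out to child agents (depth 1) and gather results in order."""
    with ThreadPoolExecutor(max_workers=MAX_THREADS) as pool:
        futures = [
            pool.submit(run_agent, name, task, 1)
            for name, task in assignments
        ]
        return [future.result() for future in futures]


if __name__ == "__main__":
    for line in coordinate([
        ("planner", "split the work"),
        ("builder", "write the UI"),
        ("reviewer", "check the result"),
    ]):
        print(line)
```

With a depth limit of one, only the main thread hands out work, which keeps things easier to follow than letting workers spawn their own workers.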

We'll give it a moment here to see what it comes back with. So it's creating a plan. I'd like you to write out the Uh let's just let it rip. Okay. And so now it's going to go rip. While this is going, I'm going to see if I can copy this content. Usually you can't copy like this. It'll just kind of kill it. If it does, that's totally fine. I don't care. And I'm going to go ahead and just make a new folder here, new file. We'll call it plan.md.

I kind of wish these plans would just save like this. There could be a plans directory. This is something I didn't check. But as that is working, I'm just curious. I'm already in the Codex directory in the home directory here. Whoops. Codex. I'm just curious if it actually has a plans directory. That's something that Claude has: if you make a plan, it'll just save it in a plans directory. It doesn't have it here. And so we'll go back here. Um,

now, is it actually using agents? Stop. Okay, so it's not using agents. And it's a good thing I saved the plan. So I'm going to go here: please use a team of agents to implement the plan.md. I've defined agents in my agents folder. Okay. And the reason I stopped it was I didn't necessarily see... Oh, but it was exploring these files. So, maybe it was going to do that, but I didn't see it. We'll wait a moment. Okay. So, just unpausing here. So: I'm checking the current repo. It's checking the TOML files.

Here we go. And so now it's spawning them up. So, it spawned this worker. I wish I could see exactly what it was doing. We'll go ahead here and just say sub-agents. So, I mean, it's definitely doing something. I just can't tell which agent is running. I don't think it's using my sub-agents here, based on this. So: I'm splitting into two agent-owned chunks now, one for the Codex TOML files and one... what do you mean, a minimal company... retargeting your infrastructure. So it seems like it's working, but it's not actually leveraging the actual workers.

Oh, here it goes. Updated the four agent TOMLs under this to target the agreed minimal scope. Oh, so it literally adjusted them for use. Oh, that's interesting. I'm now waiting on the app builder, and I'm reviewing both outputs together with lightweight validation. Well, that doesn't sound good. We'll go ahead and type in sub-agents. Updating agent TOMLs targeted for the scope. Important changes. That's not what I asked it to do. And so what I'm going to do is go back to the main thread.

Are you using the agents defined in the TOML files? You updated them, and then you have another agent just building the site by itself. You were supposed to use the agents. Okay. Instead of submitting that... Uh-huh. I'm going to go back over to our other one, this one here, and say, "All right, you're done. Report back. You're done." It says it doesn't let me invoke the project-defined Codex agents, so it only spawns generic workers. So: I updated your TOML agents.

I delegated to the available generic worker, and then only one worker. So if you want the work done that way, it needs to happen in a Codex runtime that loads project sub-agents. I can either do the next step... What the heck is it doing? Okay. Stop. What do you mean? I have defined a plan here. I have defined agents here. I need you to use these agents as subtasks. Can you do that? I don't know why it couldn't do it. It should be able to do it. Okay. So, we will give that a try.

Like, what's the point of us making those files if it can't utilize them? The plan is currently empty. Hold on. I'll save it. Okay, I've got to stop it again. I didn't save the plan. We're going to hit enter again. We're going to give it another try. I notice my usage hasn't really been going down, which doesn't make any sense, but the weekly's been going down, so I'm not sure. Maybe this value is not updating, but this is something. Oh, there we go.

Now it's showing the real value. I'm just going to confirm here. It says uh each file defines the configuration layers optional fields. Uh-huh. Well, we made it based on that definition. Oh, so now it's spawning them. Perfect. Okay. So now we're getting what we want. Okay. And so we can go watch the product planner. And so the project planner is basically doing implementation.

I'm not sure if this stuff came from before and it's working from scratch, but it is what it is. If you want, I'll proceed by delegating it monolithically. So, we'll go ahead here. I go back to the agent to the main task. So, yes, I verified the plan. The clean split is this. If you want me to continue, I'll route. Yep. Just continue till completion. Okay. So now we are seeing uh task. I wonder if this shows up as uh PS. Nope. Tasks. Nope. So it's just an internal list that it's maintaining.

I'm not sure why we're having this MCP issue here. Like, why is it even doing MCP? I'm not sure why. Okay. For some reason you were launching the Roblox MCP server, which we are not using, but I didn't mean to stop what you were doing. Please continue to manage the agents till completion. So I just kind of interrupted it while it was in the middle of work. And so I'm assuming they're just kind of hanging out here and they're

waiting for someone to talk to them. But yeah, it's clear that it's using the coordinator pattern for sub-agents. So I don't think agents are talking to each other. I think everything's being funneled through the main one here. So we'll go back over to here. I'll check the front page now. Then I'll do a local pass to close anything the agents missed before handing the final state to QA. And so it seems to be working. I'm just going to pause here and we'll let them go to town. Okay. And

either we'll hit up our usage or we'll have a product or they'll ask us to do something. All right, we are back. What do we have? So complete uh through the local uh flow setup. The app is now in a better final state. Verification completed. what I did not complete here. I did not run a real browser for smoke test. If you want the next step, I'll pass it more over to this. Uh, no, I think it's fine. We can go ahead and just take a look at the codebase. Um, and it's a very minimal application. So, this should be something very easy for us to test. I'm going to go ahead and just reveal this in my file explorer. And then I'm just going to bring over a new tab here. And we'll drag it over.

See if we got something. And it's a security origins issue, so I don't think it can work this way. What we'll do here is just stop the application. I'll run an HTTP server. Okay. And then I'll just go ahead and try this again. Oh, look at that. Oh, we got ourselves a little app. Hello. We'll add that task. It's actually a nice UI. I like it. It's fine. It's even mobile friendly. So, there you go. That was us using sub-agents. A little finicky, but we did get there in the end. Okay. Ciao ciao.
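
The origin problem above comes from opening the page straight from disk, so the fix is to serve the folder over HTTP on localhost. The video uses some HTTP server tool off screen; as a stand-in, here is a minimal sketch using only Python's standard library. The function names and the port number are arbitrary choices for the example.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory, port=8000):
    """Create a local HTTP server that serves static files from directory."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


def serve(directory=".", port=8000):
    """Serve directory until interrupted with Ctrl+C."""
    server = make_server(directory, port)
    print(f"Serving {directory} on http://127.0.0.1:{port}")
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()
```

Calling `serve(".")` from the app folder and opening the printed address in the browser is enough for a small static app like this one.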
