Why Humans Are Still Better Than AI at Editing Documents

This talk explains why human editors remain superior to AI for document editing tasks, using examples from legal contracts and academic work. It discusses how document structure, formatting, and references create complexity that AI tools often mishandle, while highlighting the importance of reliable editing interfaces for professional and academic applications.

Full English Transcript of: Why You're Still Better Than AI at Editing Documents - CS50 Tech Talk

Hello, world. Hello, everybody. I hope everybody's doing well. Nice to digitally meet everybody. It is truly such an honor for me to be here. I absolutely love CS50. I took this course myself a few years ago, quite a few years now, and it was so inspiring to me. I can honestly tell you that I still draw so much inspiration from this class every day at my work. So thank you, honestly, for having me, and thank you for being here.

A little bit about me. Today, I am the director of engineering at a company called Superdoc. When I joined the company, we were building a contract management product. What does that mean? Think legal tech. And when you want to manage contracts, you can imagine that the next thing you want to do is create them or edit them, right? So we needed a way to take something like Microsoft Word's functionality, and I hope everybody's at least somewhat familiar with Microsoft Word, and put all of that functionality inside our own app. And we thought, this probably exists already, right? I mean, surely. But it didn't.

Not a true DOCX editor, anyway. Especially not one that we could extend and program or build on top of. And it had to be real DOCX. Our customers, who are often lawyers, needed to seamlessly move the DOCX files that they already have, DOCX files being the Microsoft Word files, in and out of our app and then back to Word or to Google Docs, for example, which maybe some of you are more familiar with. So we built Superdoc and we made it open source, meaning that our source code is completely public. And why did we do that?

Well, this seemed like a problem that could be solved better by a community. So you can actually go look at our source code right now. You can go and see just about everything that Superdoc is, literally from day one, if you go backwards and start browsing the commits. Anybody in the world can literally just read the code, understand the code, use the code, and contribute to the code. Just to be clear, Superdoc isn't an end-user tool like Microsoft Word or Google Docs, in that you download it and you run it on your desktop.

It's not a replacement editor. It's actually a set of tools for developers like yourselves. It's infrastructure for building your own DOCX applications. And it is our company's entire focus. Now more than ever, software and AI need to work with these DOCX documents too, not just in Word or Google Docs, but inside custom-built applications or automation pipelines, just everywhere. More and more people are building more software and more systems than ever before. And AI is starting to participate in those systems.

And a lot of the data that actually matters ends up in documents, right? You know, contracts, reports, and so on. These systems need context, they need data, they need information. And when you start looking, documents are everywhere. They quietly shape so much of how the world works, from our schools and our businesses to our careers (think resumes) and the way governments operate. So why documents? Why talk about documents or DOCX in a CS50 talk?

You might think of a document as just text or a bunch of paragraphs, lists, headings, and so on, but underneath all of that, there's actually structure, logic, state, and rules that hold it all together. These DOCX documents are actually much closer to software than they look. Now, think of the last document that you created or edited. Maybe it was earlier this week, maybe you were working on an essay or an assignment, or it might even have been earlier today.

Maybe some of you are even taking notes in one of these apps, like Google Docs, right now. I actually used to use Microsoft Word myself for my math courses. I used to write my math assignments in Word because you can use the equations feature, which is actually quite cool. If you've never tried it, give it a try. But just think about that experience of editing the document. What do you remember about that experience?

I don't mean about the contents of the document, of that last document that you were working on. I don't mean your ideas, I don't mean your research or your notes. I mean the actual experience of editing that document with one of these applications like Google Docs. You probably can't remember much about it, right? Maybe you've never even thought about it. And that would actually mean that these editors are working as expected.

It means that the long history and evolution of these tools, and they've been around for quite some time, has led us to a point where the tools themselves disappear into the background and let you focus on what you want to do: to put your ideas on, quote unquote, paper, and then to be able to share those ideas with others or to collaborate on those ideas with others, right? Usually, if you write a document, unless it's just your own personal notes, you're probably gonna share it with someone.

The process for you or me to edit a document in something like Word is mostly unremarkable. And it should be unremarkable. A DOCX file, the actual file type that Word uses to store these documents, feels like it should be simple too, or at least simple enough that we don't even have to think about it. And the last time I checked, it's 2026, which also means the future. We have this whole new set of incredible tools available to us that help us refine our ideas, help us write, help us edit.

They help us analyze. Things like our friends, the chatbots. Think of ChatGPT, Claude, Gemini, Copilot, depending on which flavor you prefer. I'm gonna mostly refer to these as agents, although I'll admit that's not strictly accurate, but let's not get lost in that right now. These agents, as we'll call them, are powered by something called large language models, or LLMs.

Now, if you're not familiar with LLMs or you haven't covered the AI chapter in CS50 yet, that's OK. Don't worry, you don't actually need to know anything about LLMs here. Just know that they exist, and that there's something called a large language model that powers something like ChatGPT. And you probably know what these agents are capable of firsthand anyway. They can help you with work, ideas, writing, and so on. So let's have them help us with a simple example.

Here I have ChatGPT on the left, and on the right we have Claude. And I have this very simple document. It's a Word document. You can see it says "Hello CS50. Welcome." Let's just see if we can get our friends, the chatbots, to do some editing with us. So I'm gonna add the file to both. Let's start with a real simple request and ask each of them to add a paragraph saying hello to everyone at CS50, using track changes.

Now, if you're not familiar with track changes, Microsoft Word and Google Docs have this feature, sometimes called suggesting, where if you make a change, say I type "Hello everyone," it tracks the change. So here you can see it's told you that I deleted the word "Welcome" and made a change here, right? And you can accept or reject that change. It's actually very powerful, and it's standard procedure for somebody like a lawyer, for example. So, a pretty normal, simple operation. Let's ask both.

Let's see if they can help us here. They're both pondering, and after just a moment, let's see what we got. So ChatGPT came back first. Sometimes Claude comes back first, sometimes ChatGPT comes back first. And ChatGPT, then, show us the document. OK, fair enough, it's telling us to just download the document. OK, fine, let's try and do that. So, I've downloaded the document. And look at that.

It did add a paragraph. OK, cool. But it's not a tracked change, right? Notice that there's no track change bubble. I have no way to accept or reject this, right? So it's not really what I wanted, because if this were a contract, I suddenly wouldn't know what changed, right? This could be 1,000 pages long, and I wouldn't know that this particular sentence had changed.

This particular example is very simple, but even so, I didn't quite get what I wanted. OK, let's see what Claude did. So Claude is just saying, straight away, I'm not able to help you here. I'm not able to add track changes. Option one: do it yourself in Word. OK, fair enough. So I'll probably just give up on Claude there for a moment. Let's try a different one with ChatGPT, though, something a little bit more interesting.

Let's take this license agreement. Actually, maybe I should show you what the thing looks like first. So it looks like this, and it's a very real, well, not really very real, license agreement between Rubber Duck Incorporated and Superdoc Enterprises. And if you look at the list here, number one says "Definitions. Licensed Software." What I want is to make this into a new list, because I'm going to have multiple definitions, other definitions, right? So again, if I ask you to do that, you can probably do it. I just did it in about 3 seconds, right?

Now, I'm gonna prompt: split Definitions into 1. Definitions and 1A. Licensed Software. Let's see what it can do for us. Let's go to Claude too, why not? OK, so let's see what ChatGPT can give us here, if it's much help for us. So, this is the result from ChatGPT.

Let me actually just put it on the right, and put this one on the left. So it appears it didn't do it, right? You can actually see "Licensed Software." OK, so nothing really changed there. What did it actually tell us? It's telling us that it did it, but it didn't. It didn't actually do it at all.

Now, if you trusted what you read, you may think, cool, my contract has been updated, I'm good to go. And yet you wouldn't be, right? Now, if you look at Claude, it actually decided to just put this inline. It didn't actually put it in the document, so I suppose I could copy and paste this. It did understand what I wanted in this case, but it didn't put it into our document.

So you can imagine a scenario in which one of these requests could have added, for example, just another bullet, right? The request wasn't ambiguous; the results were ambiguous, and you can imagine that something like this happens. And then if we read further down our contract, and most contracts work this way, some other clause is referencing Section 2A. Well, now Section 2A doesn't have the same meaning anymore, right? So this isn't just my contract not really working. This is much worse than that.

Now my contract is actually incoherent. It's not valid anymore, right? So, unfortunately, it seems that these aren't really working for us. You can imagine what we'd really want here. Maybe it's an experience more like Word or Google Docs, right inside the browser, right alongside what we're doing here, not separate from what we're doing, but part of it.

Like all the power and simplicity of Word, but right in the browser with the app. Something like this, for example. And this is a full application. It mimics a ChatGPT application, though of course it looks a little different. But this one has Superdoc in it. Superdoc, the developer tools that I mentioned, powers this application, so you have a real editor here.

And you can see that earlier, and I did this ahead of time on purpose because I wanted to move this along, I prompted it with the same thing: split Definitions into 1 and 1A, and keep the Licensed Software text under 1A. And you can see the result now. Maybe you don't actually believe me that it can do this. So why don't we just try it? I'm going to reload the document. You can see that Definitions and Licensed Software are once again one thing. I'll send that prompt, and let's see if it'll do it now.

Hopefully. Sometimes these things work, sometimes they don't. Actually, it did work. So you can see it understood exactly what I wanted. It did it, and I could see it in real time, right? So this is more like the experience that I think we expect. And it just works, right? So let's explore a little bit about how this is working.

Imagine that, instead of these simple examples, you wanted to get some help with your essays or your resume, something important. Based on how those examples went with ChatGPT and Claude, would you use them? Probably not, right? They didn't really help us here. They actually made the process of editing highly remarkable, in a bad way. But set aside the experience of editing itself. The results themselves weren't just bad or unreliable, they were unusable. We actually can't use those results. So why is that?

Why is it that you or I can do this sort of thing with such ease in Word or Google Docs or Superdoc, but we weren't able to perform the same simple edits with our friends, the chatbots? Let's try to unpack that question. Or, should I say, unzip that question. So, what is a DOCX file? We can answer that by just looking at the same original file that we had, so let me just open that.

It's just this Hello CS50 file. You might recall it said "Welcome." And here in my folder, you can see it ends in .docx, right? DOCX. And what this file actually is, is a folder containing other folders and files. What do I mean by that? Well, it's actually a zip file. If you're not familiar with what a zip file is, it just means you take a file or a folder and you compress it.

Meaning you group it, and then you make it a little bit smaller in file size. And I can prove to you that this is a zip. If I go over to my terminal here, and hopefully you're familiar with the terminal, but even if you're not, it doesn't really matter, there's a little tool that I can use called unzip. I can tell it that I want to unzip the Hello CS50 .docx file, and I'm gonna give it a destination. Let's call it hello.

So you can see that it did something called inflating, and then over here we have a folder called hello. Now if I go into that folder, this is what's inside our DOCX. You can see a bunch of folders, a bunch of files. And there's a lot of things called XML, right? And you can see that if I open them and start browsing around, they look pretty cryptic. But they're still just text, right? And there's a lot of them. So it turns out that our one file has a lot going on.

In case you don't know, XML is another file type, like, David was mentioning, I think, BMP earlier, or JPEG or something. XML stands for Extensible Markup Language. And that's just a fancy way of saying plain text that combines structure and data. It kind of looks like HTML, if you're familiar. You may not have gotten to that part of CS50 yet, but that's OK.
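The unzip step from the terminal demo can be reproduced in a few lines of Python. This is a minimal sketch, not the file from the talk: it builds a tiny stand-in package in memory (the part names follow the usual DOCX layout, but the XML contents are placeholders invented for the example) and then lists what is inside, the same way unzip did.

```python
import io
import zipfile

# Placeholder parts for illustration only; a real Word file contains
# more parts than this, each with real XML content.
PARTS = {
    "[Content_Types].xml": "<Types/>",
    "_rels/.rels": "<Relationships/>",
    "word/document.xml": "<document/>",
    "word/styles.xml": "<styles/>",
}


def build_package(parts):
    """Write the given parts into an in-memory zip archive, like a .docx."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, text in parts.items():
            archive.writestr(name, text)
    return buffer.getvalue()


def list_parts(package_bytes):
    """Return the file names stored inside a zip-based package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        return archive.namelist()


package = build_package(PARTS)
print(zipfile.is_zipfile(io.BytesIO(package)))  # the package really is a zip
for name in list_parts(package):
    print(name)
```

Pointing `zipfile.ZipFile` at a real .docx path instead of the in-memory bytes should list that file's actual folders and XML parts.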

You've probably seen a website before, which means you've probably seen some HTML before. Let me show you what I mean. This tool is also built with Superdoc, and what I've done here is I've taken the same file. Actually, this one doesn't have the "Welcome," but that's OK. And you can see here in the dropdown, actually, let me zoom in a little bit so you can see it better. You can see that XML, a little bit better formatted, but it's still pretty cryptic, just a bunch of tags, right? You have these angle brackets. Now, if you know HTML, this probably starts to look real familiar.

And if I start typing on the left, you can see that it's changing on the right. Right? So the document that you're used to and the editor are combining to somehow change what looks to be sort of like source code. And this can get pretty complicated over here. I mean, this one's pretty simple, but take, for example, those track changes I was talking about. Let's say I wanna replace this with, um, "Hello, today," I don't know.

You can see that all of a sudden a bunch of stuff appeared over here, right? There's a whole bunch of things. And just in case you don't totally believe me that this is in fact the real file that you're looking at, I can say "hello world," maybe. And here, I'm gonna rezip it.
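To give a rough idea of the "bunch of stuff" that appears when a change is tracked, here is a hedged sketch. The paragraph below is hand-written for illustration, not copied from the demo: it uses the WordprocessingML `w:del` and `w:ins` elements, and the author name, date, and text are made up. The small function just reads the tracked changes back out.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hand-written example paragraph: "Welcome" is deleted and "Hello everyone"
# is inserted, both as tracked changes. Author and date are invented.
PARAGRAPH = f"""
<w:p xmlns:w="{W}">
  <w:r><w:t xml:space="preserve">Hello CS50. </w:t></w:r>
  <w:del w:id="1" w:author="Example Author" w:date="2026-01-01T00:00:00Z">
    <w:r><w:delText>Welcome</w:delText></w:r>
  </w:del>
  <w:ins w:id="2" w:author="Example Author" w:date="2026-01-01T00:00:00Z">
    <w:r><w:t>Hello everyone</w:t></w:r>
  </w:ins>
</w:p>
"""


def tracked_changes(paragraph_xml):
    """Return (kind, author, text) for each tracked change in a paragraph."""
    root = ET.fromstring(paragraph_xml)
    changes = []
    # Deleted text lives in w:delText; inserted text lives in ordinary w:t.
    for kind, tag, text_tag in (
        ("delete", "w:del", "w:delText"),
        ("insert", "w:ins", "w:t"),
    ):
        for node in root.findall(tag, NS):
            text = "".join(t.text or "" for t in node.iterfind(f".//{text_tag}", NS))
            changes.append((kind, node.get(f"{{{W}}}author"), text))
    return changes


for change in tracked_changes(PARAGRAPH):
    print(change)
```

Even this two-word edit needs wrapper elements, an author, a date, and a different text tag for deleted text, which is part of why hand-generating this XML is easy to get wrong.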

Oh, it didn't go. OK, that's OK. It should appear here. Actually, you know what, let me show you something more exciting. Let me go to different files. So again, there are multiple XML files. This one's called document.xml, but there are all these other XML files. Why are they there? If you have any experience with HTML, you can sort of think of the other files like CSS, although that's not a totally accurate representation, but let's go with it.

So this one's called styles, and you don't need to understand any of these tags, don't worry. There's a thing called docDefaults. I assume that means document defaults. I can add something like the b tag. Let's see. Actually, it's not gonna let us. Oh, that's interesting. Let me try it again.

Hello, CS50. Let me show you this, if it'll let me do it. Let's try it again: the w:b tag. Oh yeah, OK, so you can see everything on the left became bold. So the b tag in Word's language means bold, kind of like in HTML, actually. So I've edited the document to have a default of bold. I can remove this as well, and I can reapply it. When I remove it, you can see everything suddenly became not bold. So our simple DOCX file is not just one thing. It's this whole zip file, a whole bunch of files compressed. It's a compressed folder.

DOCX is not just that one thing. It's a combination of things, and those things are XML files, plus a few other files that can appear in there. I don't want to forget the other files, but we don't need to worry about those yet. The format you see on the screen when you open that file in Microsoft Word or Google Docs or Superdoc, your editor, is a sort of illusion.
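The default-bold toggle from the demo can be sketched in code. This is only an illustration under stated assumptions: the styles fragment below is a stripped-down stand-in written for this example (a real styles.xml has far more in it), and `set_default_bold` is a hypothetical helper, not part of any editor's API.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}
ET.register_namespace("w", W)  # keep the familiar "w:" prefix when serializing

# Stripped-down stand-in for styles.xml, written for this example.
STYLES = f"""
<w:styles xmlns:w="{W}">
  <w:docDefaults>
    <w:rPrDefault>
      <w:rPr/>
    </w:rPrDefault>
  </w:docDefaults>
</w:styles>
"""


def set_default_bold(styles_xml, bold):
    """Add or remove <w:b/> in the document-wide default run properties."""
    root = ET.fromstring(styles_xml)
    run_props = root.find("w:docDefaults/w:rPrDefault/w:rPr", NS)
    existing = run_props.find("w:b", NS)
    if bold and existing is None:
        ET.SubElement(run_props, f"{{{W}}}b")
    elif not bold and existing is not None:
        run_props.remove(existing)
    return ET.tostring(root, encoding="unicode")


bold_styles = set_default_bold(STYLES, True)
plain_styles = set_default_bold(bold_styles, False)
print(bold_styles)
```

One small element in one file changes how every paragraph renders, which is the sense in which the on-screen document is produced from the files rather than stored in them.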

What you see on screen is more like a representation or a projection. Think about what else in CS50 you have seen that looks a lot like this. I would argue, almost everything. Think about what you build in CS50, your assignments. They usually involve one or more files, usually in a folder. And those files by themselves are not the final thing. They are the source. Literally, they are the source code. And somewhere between those files that you write and the thing you experience, your app, something has to interpret them, compile them, render them, or otherwise bring them to life.

If you're writing code in C, that might be a compiler. If you're building for the web, that might be a browser like Chrome. In both cases, there's a layer between the human-readable file that you write, like, say, our XML, and the final program that you actually use. And so a document is not just the files in a folder either. It is somehow the experience produced from those files by tools like Word, Google Docs, or Superdoc. And the reason those tools feel so natural is that they hide the complexity that's happening underneath.

You don't have to think about the system. You don't have to think about those files and folders and XML things. You just write, edit, collaborate, format, bold, and save. That, in computer science, is what we call an abstraction. It is a way of hiding lower-level details so that you can focus on the higher-level problem that you actually care about: in our case, your ideas, your essays, your resume. And abstractions are incredibly powerful.

But they're also a kind of promise. The promise is that all of that hidden complexity is going to behave itself. That the structure underneath will preserve what you meant. That the document that you wrote here will still be the same document that someone else sees over there. And that promise gets tested the moment the document has to move. And most documents do move. Most documents are made to be shared or collaborated on. They have to survive a journey.

Let's take a pretty real example. Imagine two companies negotiating an acquisition. Each company has its own legal team. Each side might also have an external law firm or some other kind of group, maybe multiple law firms. And within each of these groups, so you have Company A, you have Company B, you're gonna have a ton of different people involved, reviewing, editing, commenting, making changes, maybe tracked changes. So what actually happens to that document? The document's going to get passed around, right?

There are going to be different versions of the document, internal and external, and they're going to get passed from one company to another. And all of these people might be on different systems, like Windows or macOS, and some might be using Google Docs and some might be using something else. And here's the important part. Even when the words stay the same, the document can change. What if this contract that we're talking about makes reference to some Section 2A?

You might remember that example from earlier, right? We had some section called 2A. Legally, clause 2A has some specific meaning and some specific language. If some edit happens to that document unintentionally that changes the list numbering, now a reference to 2A in that document has an entirely different meaning. Think of it like a pointer, in computer science terms. A pointer doesn't hold the thing itself, it just tells you where to find it. And that works great as long as what it's pointing to doesn't move, right?

But if the underlying data shifts, if something gets inserted, deleted, or rearranged in some unexpected way, then, at least in our document analogy here, the same pointer can suddenly point to the wrong thing. The pointer didn't change, but what it means did. Suddenly, you have a big bug in your app, or a big bug in your legal contract, if you will. Either way, it could mean disaster. And if documents just stayed in one app forever, with one person, this wouldn't really be a problem at all. But they don't.
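The pointer analogy can be made concrete with a small sketch. Everything here is invented for illustration: the clause text, the `label_clauses` helper, and the simplification that labels are derived purely from position, which is roughly how automatic list numbering behaves.

```python
def label_clauses(clauses):
    """Assign labels like '2A', '2B' to a section's clauses by position."""
    return {f"2{chr(ord('A') + i)}": text for i, text in enumerate(clauses)}


# Made-up clauses for Section 2 of an imaginary contract.
section_two = [
    "Payment is due within 30 days.",
    "Late fees accrue at 1% per month.",
]

# Elsewhere, the contract says "as described in Section 2A".
cross_reference = "2A"

before = label_clauses(section_two)[cross_reference]

# An unintended edit inserts a new clause at the top of the section.
section_two.insert(0, "All amounts are in US dollars.")

after = label_clauses(section_two)[cross_reference]

print(before)  # what "2A" meant when the reference was written
print(after)   # what "2A" means after the numbering shifted
```

The reference string never changed, yet it now resolves to a different clause, which is the document version of a pointer into memory that has since been rearranged.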

They move across people, companies, tools, and systems. And in that journey from system to system, the document can start to unravel. Let me show you a quick example with just Google Docs and Word. So, in Word, we have this thing called fields, and you can see that when I click up here on Rubber Duck, it shows a little box around it that indicates it's a field. Every time I click on Rubber Duck, you can see the field, hopefully.

Like here, right? And one of the things this is doing is that if I change this to say, I don't know, Nick Real Enterprises, you're going to see it change everywhere in the document, right? So you can imagine, this is important functionality. But when I upload the same thing to Google Docs and I start making changes, nothing happens. Right, you can actually see from a previous demo that it's no longer coherent. It doesn't do the same things, and if I were to export this and then open it in Word, it's not going to do the same stuff. It's lost functionality.

And this is between just those two apps. Now imagine if there were other apps that could take away other types of functionality. Here's one other example, real quick. This is a very fake essay. And it's got some endnotes, and endnotes always appear tucked underneath the end of the last paragraph in the document. But when I opened it in Google Docs, you can see that it's changed there. They're not endnotes. You can actually see that they've been split up.

There's 3 up here and 2 down here. They're actually footnotes. It somehow translated them into something completely different. And it changes my document, because now my first page ends somewhere else and my second page has a different length. And not only that, and I don't know if you can see this because it's probably really tiny, but originally I was using lowercase Roman numerals. Let's try to zoom in. But Google Docs has decided to change them to regular numbers. Again, more changes. So just going from Word to Google Docs has made a bunch of changes to my document.

And going back, well, there's gonna be some more changes. Even with just these two, the two most well-established editors, there are some really critical differences. And the journey isn't just Google Docs to Word and back to Google Docs again, right? Or vice versa, I guess, Word to Google Docs and back to Word. It's some custom app, then it's exported, goes to another system, gets imported, maybe some AI edits it, it goes back to somebody, maybe you email it to someone and then that person has to edit it with Copilot, and then it goes back into some other custom system, and around it goes.

And if those systems are making any changes to the document, they better be doing so correctly and safely. It's not just people editing documents. Software has to edit the documents too, and understand them, and not in some vague, kind-of-works-most-of-the-time sort of way. It has to be reliable. It has to be consistent. The documents have to survive the journey. But wait a minute.

Why do we care so much about DOCX? Why not .pages or .txt or .md files? Why do DOCX documents show up in so many places? Well, this didn't happen by accident. In fact, if you've been around documents long enough, you may remember that before the DOCX file, Microsoft Word used the .doc file extension, without the X. That original .doc file was not a zip file containing XML. It was a binary file, basically machine speak, if you will, ones and zeros, that only Microsoft Word could read. So we couldn't have just cracked it open and peeked inside to see what was in there, the way we did with our DOCX.

But the DOCX format that we've been talking about is based on something called Office Open XML, or OOXML. And that word "open" is doing some real heavy lifting here. Historically, document formats weren't really, quote unquote, open. They were tightly controlled by the applications that created them. So if you wanted to work on a Word document, well, you needed Microsoft Word.

If we travel back in time for a moment to the very long time ago of the 90s and early 2000s, documents became central to how businesses, governments, and organizations operated. And there was increasing pressure to make sure that documents could move between systems, that you weren't locked into one single tool just to edit or read your own data. Some of that pressure was legal, because if one company controlled the format for documents, well, that would mean controlling entire ecosystems around them. And industries.

It can start to look like a monopoly. And that became a big, big deal, because documents weren't just files anymore. They were infrastructure, and they still are. And this is where something called antitrust comes in. Antitrust is about making sure one company doesn't use its control of something like the DOCX, or the DOC at the time, in a way that harms customers or prevents competition. And Microsoft had already come under major scrutiny in the late 90s and early 2000s. So there was real pressure, including regulatory pressure, to make document formats more open so anyone could work with them.

And around 2006, 20 years ago, Microsoft moved away from that old binary .doc and introduced the .docx format that we use today. And then by 2008, the DOCX format became an international standard, the Office Open XML standard. Now, instead of a document being controlled by one application, it became something that, at least in theory, anyone could implement. The specification itself is something like 8,000 pages long, so, in theory. But this is how we can even be having this conversation today.

This is why DOCX documents show up everywhere. It is quite literally the international standard for documents. So let's go back to that document journey for a second. As documents move from tool to custom tool to another system and then back into Word, all these tools need a reliable way to read documents, understand them, and change them. If we zoom out a bit from the narrow scope of documents and just look at software in general, how does software normally interact with systems and data?

Let's imagine for a second that you're building an app that needs to send text message reminders to someone. You're probably not going to build your own cellular network. You're not going to figure out radio towers and cellular protocols and all of that. Instead, you need that cellular network to tell you how to interact with it. Right, they can tell you how to interact with it and they can hide their complexity.

They can just give you instructions. Remember those abstractions I mentioned earlier? The same idea shows up everywhere in software. If you want to process a payment, for example, you don't need to build a bank. If you want to store a file, you don't need to go build a hard drive. If you want to send an email, you don't need to build an email server. And if you want to make apple pie, you don't need to first invent the universe. You just need a recipe, get the ingredients, and make some tasty pie. In computer science, we give that layer a name.

We call it an API. That stands for Application Programming Interface. An API defines a set of actions. It's a safe, predictable way to interact with the system. It's like a contract. Given the same input, you get the same result every time. We call that determinism. That matters more than it might seem here, because software doesn't just run once. It doesn't make an API call, which is what we'd say here for triggering some action, just once. Assuming people are using your software, it happens over and over, millions of times, and every time it happens, it needs to behave the exact same way.
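As a small illustration of that contract idea, here is a sketch built on the text-message reminder example. None of this is a real provider's API: `Reminder`, `build_reminder`, and `FakeGateway` are hypothetical names made up for the example, and the gateway is a stand-in that only records what it was asked to send.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reminder:
    to: str
    body: str


def build_reminder(name: str, phone: str, time: str) -> Reminder:
    """Deterministic: the same inputs always produce the same Reminder."""
    return Reminder(to=phone, body=f"Hi {name}, your appointment is at {time}.")


class FakeGateway:
    """Stand-in for a provider's API; callers never see how delivery works."""

    def __init__(self):
        self.sent = []

    def send(self, reminder: Reminder) -> bool:
        # A real gateway would talk to the cellular network here.
        self.sent.append(reminder)
        return True


gateway = FakeGateway()
first = build_reminder("Sam", "+15550100", "3pm")
second = build_reminder("Sam", "+15550100", "3pm")
gateway.send(first)
print(first == second)
```

The app only knows `send` and the shape of a `Reminder`. Radio towers and protocols stay hidden behind the interface, and calling `build_reminder` a million times with the same inputs gives the same result a million times.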

Without that, every interaction becomes a guess. Every result could be a little different. That would not be acceptable for software, and as we've seen, it's not acceptable for documents either. Think back to that experience I referred to at the beginning as unremarkable: your or my experience of editing something in Microsoft Word, the experience itself. The reason it is, or should be, unremarkable is that Word, Google Docs, or Superdoc are purpose-built, proven interfaces for humans to interact with the system underneath that DOCX. But software needs its own kind of user interface.

It needs an API. Humans work on documents via Word. Software needs to work on documents via a document API of sorts. So, earlier, we saw that we couldn't really collaborate with ChatGPT and Claude on our document examples, right? And I've asked a question that I've not actually answered yet. Why is it so simple for us humans to edit a document in Word, but the agents were sort of helpless here? I hope you're starting to see the answer to that question. We're asking the agents to do the wrong thing, or at a minimum, we're asking them to do something that they can't do.

It's like asking them to make that apple pie without a recipe or an oven. Maybe you've seen this image or a video of it before. This exists somewhere in the world, and it's basically a mat with a bunch of buttons on it, and each button is assigned some specific word or action. And when the dog presses the button, it triggers a sound for that particular word. So people can train their dogs to push certain buttons depending on what they want, and the dog can sort of, so-called, speak. So there's a button for outside, a button for food, a button for play.

The dog isn't generating language from scratch here, or, it's not even really talking, it's just pressing buttons. But it's learned to press those buttons in whatever way this dog, I think this dog's name is Bunny, um, whatever way it learns best, probably a treat, right, or a biscuit. And each button does something really specific in a very specific way every time, sort of like what we expect of an API. And that's basically how you or I edit a document anyway, right? When you're in Word or Google Docs or Superdoc, you're not thinking about XML or file structures or how numbering works under the hood.

You're just doing things: adding some track changes, adding a footnote, inserting into some list. You're just pressing buttons. And the reason that ChatGPT and Claude struggled to collaborate with us on the documents that we tried is just that we're not giving them the right interface or the right tools. We should be giving them buttons to push.

Press this button, add a comment. Press that button, create a track change. Press this other button, insert a footnote. And those buttons are the API, the document API that I mentioned earlier. And this is a very common pattern for how all software works. Our agents already understand this concept out of the box, as in, oh yeah, I see that there's a button for what my human friend wants me to do here, let's push it. So we don't need AI to understand every detail of this document format, and we don't need it to generate perfect XML. We don't want it to try and understand thousands of pages of Open XML specifications just to make a simple edit.
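The "buttons" idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `Document` class, the `add_comment` and `insert_footnote` functions, and the `press_button` dispatcher are hypothetical names, not Superdoc's document API. The point is only the shape: the agent picks a button name and some arguments, and ordinary code does the actual edit.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    """Toy document model (hypothetical): paragraphs plus attached annotations."""
    paragraphs: list[str]
    comments: list[tuple[int, str]] = field(default_factory=list)
    footnotes: list[tuple[int, str]] = field(default_factory=list)


def add_comment(doc: Document, paragraph: int, text: str) -> str:
    doc.comments.append((paragraph, text))
    return f"comment added to paragraph {paragraph}"


def insert_footnote(doc: Document, paragraph: int, text: str) -> str:
    doc.footnotes.append((paragraph, text))
    return f"footnote {len(doc.footnotes)} added to paragraph {paragraph}"


# The "mat of buttons": each name maps to exactly one well-defined action.
BUTTONS: dict[str, Callable[..., str]] = {
    "add_comment": add_comment,
    "insert_footnote": insert_footnote,
}


def press_button(doc: Document, name: str, **arguments) -> str:
    """Run the action registered under `name`; unknown buttons are rejected."""
    if name not in BUTTONS:
        raise KeyError(f"no such button: {name}")
    return BUTTONS[name](doc, **arguments)


doc = Document(paragraphs=["Green Light Capital"])

# A request as an agent might emit it: a button name plus arguments, no XML.
call = {
    "name": "insert_footnote",
    "arguments": {"paragraph": 0, "text": "This is a real contract."},
}
result = press_button(doc, call["name"], **call["arguments"])
```

In this sketch the model never touches the file format. It only chooses which button to press and with what arguments, and the code behind the button is responsible for making a valid edit.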

We just need it to push the right button at the right time. So why don't we try that first example again? Or a similar example, I guess. But this time, I'm gonna give the agent all the tools it needs. I'm gonna give it Superdoc's document API and some instructions on how to use it. Although, I'm gonna avoid customizing ChatGPT, which one can do, but it's a little clunky, and I'm just gonna use the underlying LLM.

Remember, LLM, large language model, is the thing that powers ChatGPT. So, I'm actually gonna use the same exact model within our... there we go. OK, and so let's see what could happen if I give the same agent, again, this is the same model. I don't want to distract from that. I'm using ChatGPT 5.4 here and I'm not making the model smarter in any way.

I'm just giving it the right interface. I'm not giving it more intelligence. Just the right tools. So, let's try an example. I'm not gonna start from the same examples from earlier, but here we have a contract, and this particular contract is fairly lengthy. And what I want to do is I actually want to insert a new clause between these two clauses. So, I'm gonna ask it to add a new clause called product recall liabilities. You can see that we have assumed liabilities and excluded liabilities, so between those two items.

And it's gonna think just the same way ChatGPT thinks, and it is just ChatGPT, and after a moment you can see that it's edited the document. But notice something interesting as well: it didn't just add a clause for us, it actually followed the formatting patterns of the clauses around it. And this is important because if I had asked you to do that, you'd probably do that too. You would have looked up and noticed, oh, clause 1.3 starts with a little underline, and so does clause, well, previously 1.4, now 1.5, also starts with a little underline. My clause 1.4 should probably do that too, right?

So, it can look and understand the context, not just of the words, but of the formatting of the document, and this is critical for maintaining coherence, readability, and meaning within important documents. Let's take a look at another example. So, I'm gonna open up a new document here. Waiting for ChatGPT to join us. There we go. So, I'm gonna ask it... this particular, again, fake example is, um, a Series A preferred stock purchase agreement.

Now, it says the words Series A a whole bunch of times in this document. I won't attempt to count how many. Trust me, it's many. And I'm going to ask it to change all instances of Series A to Series B using track changes. But what I want to show you here, and I want you to keep in mind, is that the agent is literally collaborating with us in real time. This isn't like the file goes somewhere and then gets changed and then I have to refresh. To illustrate that, I'm gonna send the prompt.

I'm gonna start making changes, and then you'll see after a few moments that the agent actually can make the correct changes on a changed version of the document. So, it's going to get a request, it's going to look at a document, and then some time passes and some changes have happened to that document, and then it's going to add its changes once it figures out what to do. So I'm asking that, and then I'm just gonna start making some changes, so I can write hello, or, so the locations of all of those Series A things that it was looking for have been changed.

Oh, I didn't even catch that one. Fascinating. And so you can see that even though I was changing the document in real time, the agent was able to do the track change exactly as I asked, and in fact you can see now that there's a lot of Series A's changed to Series B with the track changes. So me, the human, and the machine are collaborating together in real time. And again, this isn't like I made ChatGPT smarter.
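One way to picture what just happened is the sketch below, in Python. It is an assumption about how such an edit could work, not a description of how Superdoc does it: the `Run` type and the `track_replace`, `accept_all`, and `reject_all` functions are hypothetical. It shows two ideas from the demo: the matches are found in the document as it is at the moment the change is applied, and each replacement is recorded as a deletion plus an insertion instead of overwriting the text.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """A span of text plus its tracked-change status (hypothetical model)."""
    text: str
    change: str = "none"  # "none", "deleted", or "inserted"


def track_replace(paragraph: str, old: str, new: str) -> list[Run]:
    """Replace every `old` with `new`, recording each edit as a tracked change."""
    runs: list[Run] = []
    cursor = 0
    for match in re.finditer(re.escape(old), paragraph):
        if match.start() > cursor:
            runs.append(Run(paragraph[cursor:match.start()]))
        runs.append(Run(old, "deleted"))   # the old text stays, marked as deleted
        runs.append(Run(new, "inserted"))  # the new text is marked as inserted
        cursor = match.end()
    if cursor < len(paragraph):
        runs.append(Run(paragraph[cursor:]))
    return runs


def accept_all(runs: list[Run]) -> str:
    """Text after accepting every tracked change."""
    return "".join(run.text for run in runs if run.change != "deleted")


def reject_all(runs: list[Run]) -> str:
    """Text after rejecting every tracked change (the pre-edit text)."""
    return "".join(run.text for run in runs if run.change != "inserted")


# The agent looked at this version when it received the request...
seen_by_agent = "Series A Preferred Stock"
# ...but the human typed "hello " before the agent applied its change.
current = "hello " + seen_by_agent

# Matches are located in the current text at apply time, not from stale positions.
runs = track_replace(current, "Series A", "Series B")
accepted = accept_all(runs)
rejected = reject_all(runs)
```

Because nothing is overwritten, the human can still accept or reject each change afterwards, which is what track changes are for.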

It's the same ChatGPT. I just gave it some buttons to push: the document API. So there's a couple more examples now, and these will get increasingly more fun, I think, hopefully. Um, so there's this word, Green Light Capital. One of the examples I gave earlier was to add a footnote, so let's add a footnote to Green Light Capital that says this is a real contract. Generally the type of footnote one would add to a contract, I suppose. And you can see that almost immediately it went and added the footnote.

OK, so here we have a little number one. And you can see that it added our footnote. And the important thing here is, again, this is real Doc X. So if I download that and I open this in Word, it's really zoomed in here. Yes. Change this to print layout, which I think we all probably would prefer. You can see that all the, um, track changes are there. Our footnote is there, our footnote is here. This is real Doc X, with an agent. This is that experience that you have when you're in Microsoft Word, except now you're collaborating with an agent.

Again, not by making the model smarter, but by giving it the right API. Now, what if I wanted to change the amount of money? So I don't just want it to be, I don't know, what is that, there's so many zeros, 100 million. So this is called a round size. Let's see if we can do it. So, I'm gonna ask it to change the round size to a billion. I'm not actually even gonna type all the zeros, there's too many zeros, so I'm just gonna leave it at a billion.

Let's see if it can understand what I mean, first of all, by round size, and what do I even mean by billion? Let's find out. Um, and look at that, just after a moment. And somebody can check the number of zeros for me, but it looks correct. It changed the round size to a billion, so it's doing all of its ChatGPT things of understanding the context and understanding intent, as it were, and applying that in real time to the document. So, it seems that our dog, Bunny, speaks after all. Or at least it has learned which buttons to press and when.

And so, our humble Doc X, now 20 years old this year. That's quite old in technology terms. In dog years, that'll be older. In technology terms it's very, very old. Um, our Doc X is a place where computer science, collaboration, and AI are all colliding at once today, with real world consequences. And I wonder, beyond documents, if we're really moving toward a world where humans and machines are working together more and more, where else are we asking them to collaborate without actually giving them the right tools to do it well? Where else do other systems like this exist?

Where interfaces are built for humans. But not yet for machines. They are likely hiding in plain sight. Thank you very much. That's all I wanted to say today. Thank you for joining us.
