25 May 2026 9 min read Leon's Notes

Homework Was Never Secure...Now What?

A practitioner working through three design directions for at-home work in an AI-saturated classroom — and what other people are doing.


The scene that started this

My assignment for an A-Level Econ class: each student got a different essay prompt, a small anti-collaboration move I was proud of at the time. They were asked to present, not just hand in their work: read the essay aloud, talk us through the argument, and take questions.

The essays from the lower end of the set made sense but were wordy. When the students read them out, they could not explain what their own words meant. They could not say why they had included one point and dropped another. It was painful to watch.

I used to get annoyed. It was academic dishonesty. They should've known better. I have since realized that the cheating frame does not help. The assessment design handed them a route that was easy, quick, and met the requirements without any thinking. I never set expectations for AI use. Heck, I didn't even model good practice by disclosing my own AI use in writing the assignment. They, like me, saw a chance to use AI and took it. It was not their fault.

Sarah Eaton has been making a version of this argument at the policy level for a while. She calls it postplagiarism: the era where hybrid writing becomes the norm, and the question stops being whether students cheated and becomes what writing is for in an AI world.

Homework was never secure

Faking homework completion did not start with AI. Textbooks have answers in the back. Tutors do the working. Parents help. Answer keys live online. A photo of the problem gets messaged to a friend in another class. With AI, though, skipping the thinking became a lot easier.

I used to be able to assign MCQs and reasonably trust that students had done the hard work, because asking a friend comes with its own stigma. Now AI can answer them in a second, and students can keep that a secret from me and from their friends.

Pew found that 26% of US teens aged 13–17 had used ChatGPT for schoolwork in 2024, double the share from a year earlier. Worth noting, though: teens themselves are split on what counts. About half say using AI for research is fine. Only 18% say writing the essay with AI is fine, and 42% say it is not fine.

Faking homework completion is older than ChatGPT, but its arrival means teachers can no longer afford to ignore the problem and hope for the best. Incentives are high and risks are low.

The real problem is not AI

At-home work has always been about answer-delivery. The cognitive reps were a byproduct. The reps came from the friction of getting to the answer — the friction of looking it up, the friction of writing it out, the friction of working through a problem you could not yet solve.

AI did not change what homework is for. It means that, for teachers, measuring the output no longer tells us much, because the effort can be divorced from it.

Yes, students still need to learn what AI can do

There is a harder question under this one. People keep asking whether students still need to learn the work AI can do.

I think they do. For example, a chatbot I built told a student his essay would be marked out of 25. It would not. The bot did not know what it did not know. A student who could not check the bot's claim would have revised toward a target that did not exist.

They do not need to learn the arithmetic the bot can do. They need to learn enough to spot when the bot is wrong. Schools have not figured out how to keep that kind of learning happening in an AI world, so I gave it a shot.

What I shipped: Reedlet

I vibe-coded a tool. It is called Reedlet. I am the only developer, and I pay the API costs out of my own budget.

Reedlet is an AI-powered workflow for essay writing. The student moves through four stages with a chatbot: idea generation, then outlining, then writing, then editing. The bot asks questions. It requires intermediate work — an outline before you draft, a draft before you edit. It gives feedback. It will not write the essay. The conversation is saved and I can read it, and the bot was trained to score the interaction.
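For readers who want to picture the gating, here is a minimal sketch of the idea. It is not Reedlet's actual code. The stage names follow the description above; the `Session` class, the `MIN_WORDS` thresholds, and the word-count gate are all hypothetical stand-ins for whatever check a real tool would use.

```python
from dataclasses import dataclass, field

# Stage order taken from the description above; everything else is hypothetical.
STAGES = ["ideas", "outline", "draft", "edit"]

# Hypothetical minimum word counts a student must submit before leaving each stage.
MIN_WORDS = {"ideas": 20, "outline": 50, "draft": 300, "edit": 300}


@dataclass
class Session:
    """Tracks one student's progress through the four stages."""
    stage_index: int = 0
    artifacts: dict = field(default_factory=dict)

    @property
    def stage(self) -> str:
        return STAGES[self.stage_index]

    def submit(self, text: str) -> None:
        """Store the student's own work for the current stage."""
        self.artifacts[self.stage] = text

    def can_advance(self) -> bool:
        """The gate: enough student-written words for the current stage."""
        words = len(self.artifacts.get(self.stage, "").split())
        return words >= MIN_WORDS[self.stage]

    def advance(self) -> bool:
        """Move to the next stage if the gate passes; report whether it moved."""
        if self.stage_index < len(STAGES) - 1 and self.can_advance():
            self.stage_index += 1
            return True
        return False
```

A word count is a crude gate. It checks that something was written, not that it was thought through, which is part of why the saved conversations still need a human reader.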

The design was a bet on reintroducing friction using AI. Then I read the saved conversations.

Some students spent hours trying to get the AI to move on to the next stage without writing the essay. Others didn't take the task seriously at all. It wasn't pretty, and it's the reason I am moving back to discrete touchpoints. Here are a few problems I encountered:

One-word replies. Students minimum-effort their way through the bot's questions to reach the next stage. The friction is there. The engagement is not.
Off-topic conversations. The bot is patient and students notice. Some chats wander a long way from the essay.
Hallucinations. This is the one I want to be specific about. The bot told a student that A-Level Econ essays were marked out of 25. They are not. AS is marked out of 12. A2 has been out of 20 since Cambridge International updated the spec in 2023. The student had finished an essay and asked the bot for a predicted mark. The bot answered using the wrong total. I caught it before it shaped his revision. The bot did not know right from wrong. Only the teacher still in the loop catches that.
It still can't replace me. Reedlet was supposed to scale access to me. Instead, it became a lesser knock-off. The student got feedback from the bot, not from me. I accidentally removed myself from the trust-building feedback process. If I adopt AI-enabled workflows for my students in the future, I need to keep myself at key touchpoints to control quality and maintain motivation. That is the plan for the next version of Reedlet, and it is why AI won't replace the teacher in the room.
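Two of these problems are easy to surface automatically so the teacher knows which conversations to read first. The sketch below is hypothetical and is not part of Reedlet: `KNOWN_TOTALS` uses the mark totals stated above (check them against the current syllabus before relying on them), and the function names and the two-word threshold are my own assumptions.

```python
import re

# Mark totals as stated in this post; verify against the current syllabus.
KNOWN_TOTALS = {"AS": 12, "A2": 20}

# Matches phrases like "out of 25" in a bot reply.
OUT_OF = re.compile(r"\bout of (\d+)\b", re.IGNORECASE)


def wrong_totals(bot_reply: str, level: str) -> list:
    """Return every 'out of N' total in the reply that differs from the known total."""
    expected = KNOWN_TOTALS[level]
    claimed = [int(n) for n in OUT_OF.findall(bot_reply)]
    return [n for n in claimed if n != expected]


def short_reply_share(student_replies: list, max_words: int = 2) -> float:
    """Fraction of student replies that are at most max_words words long."""
    if not student_replies:
        return 0.0
    short = sum(1 for reply in student_replies if len(reply.split()) <= max_words)
    return short / len(student_replies)
```

Neither check replaces reading the transcript. They only flag where to look, and the totals check catches a wrong number only when the bot happens to phrase it as "out of N".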

Where I am now

For the time being, I assign homework and I do not grade it. I check completion. The work does not go in the gradebook unless it is done in class.

The reason is simple. At-home work where I cannot see the friction is not assessment data. I do not know if this is the right answer. I do not know if it is wrong either. It is what I am running with while I figure out the next move. I also wondered if my students would do the work if it "wasn't for marks." Turns out most of them still did.

Three design directions

To summarize, I am reimagining homework from three angles:

(a) Reintroducing friction inside a traditional classroom

Two things I have tried aside from Reedlet:

Error-spotting in AI essays. Students get an AI-written essay and find the flaws. This was hit or miss. Some students engaged deeply. Others didn't want to analyze AI-generated work. Fair enough.
Short in-class written or MCQ checks. Low-stakes, graded, in person, no AI in the room. I am committing to do more of this next term. This is a really good way for me to tell what students know on the spot.

Phil Dawson and colleagues make the academic case for this move: design assessments AI cannot trivially do, and stop trying to catch AI after the fact. As their title puts it, "validity matters more than cheating." Focusing on validity motivates us to redesign assessment.

Worth saying out loud: the in-class check worked partly because the incentive came with it. No AI in the room. A real outcome attached. Time and peer pressure are a thing. The friction was in the structure of the moment. Not in clever task design.

(b) The flipped version that might work

A flipped classroom that preserves the teacher's actual job is one bet. The teacher who knows the students, plans the activities, curates the sources, gives in-person feedback, guides reasoning live, and is funny and human in the room — that teacher is hard to replace.

A flipped classroom that swaps that teacher for a video is a different thing. A flipped classroom is just an online course if implemented poorly. And we all know the completion rates for those.

The teacher teaches who they are, not just the content. I've had trouble with flipping because not every student does the learning outside of class, which effectively wastes their in-class time as well. But provided I can set up interesting and cognitively demanding activities in class, and I am blessed with an ambitious group of students, I am willing to try more flipped-classroom-style teaching.

(c) Homework as pure formative

Off the gradebook entirely. Assessment lives at in-class checkpoints. Out-of-class assessments are not graded. That's what I am currently trying with my ambitious group of students. This works as long as the ungraded assessments have strong alignment with graded ones.

What other people are doing

Quick tour. Not a literature review. I just wanted to see some ideas on what it could look like Monday morning.

Curtis (2025) gives peer-reviewed pushback on the binary secure-versus-open model. His title is "The two-lane road to hell is paved with good intentions," and he argues for a middle lane where AI use is permitted under explicit constraints and enforced through layered integrity practices. The middle-lane framing supports AI in the prep work, not in the deliverable, with students handing in both the AI conversation and the unaided draft. There is a caveat I have to name: a student can run a parallel conversation and only submit the sanctioned one. Disclosure tactics ride on a trust baseline you have to build first.

Sydney's two-lane approach is the framework Curtis is responding to. Lane 1 is secure, in person, AI-controlled. Lane 2 is open, unsupervised, AI-permitted. I use it as a reference point, not as the answer, but it is useful for naming where between the lanes your assignment sits. CIE AL external exams and closed-book in-class exams are fully Lane 1. Everything else is up for design. Here's an idea I will try: an essay-planning homework where students brainstorm with AI at home, then come to class and produce a five-minute oral synthesis with no notes. AI does the divergent work at home; the convergent synthesis under time pressure is what I assess.

The AI Assessment Scale Revisited proposes five levels of AI integration, from No AI through AI Exploration. It is a granular alternative to a binary lane. AIAS gives a level-by-level vocabulary for the question "What cognitive rep is this task producing?" Each homework task gets a level label on the assignment sheet, so students know what AI use is in scope before they start, and teachers know what they are asking students to practice. Here is what the five levels could look like in an A-Level Econ class:

Level 1, No AI. The May exam is already a Level 1 by structure. So is a short in-class written or MCQ check at the start of the lesson.
Level 2, AI Planning. A homework where students use AI to brainstorm essay angles. They come to class and debate, without AI, toward a unified outline that ranks, rejects, and synthesizes what the AI offered.
Level 3, AI Collaboration. A practice essay where AI co-drafts. The student then critiques and revises the AI draft, flagging where the AI got the Econ wrong or stayed generic. The student is assessed on the quality of the improvements they make.
Level 4, Full AI. A case-study analysis where students use AI to summarize market data, make and defend the Econ judgments themselves, then use AI to build a slide deck that they present.
Level 5, AI Exploration. AI is used in novel ways to solve a problem, such as designing an AI workflow that writes economics essays, or building an AI grader that can accurately grade essays.

A related practical move is an appendix disclosure: every assignment ends with a one-paragraph appendix naming what AI was used for and where the student chose not to use it. It takes ten minutes and builds a culture of acceptance and trust around AI use.

Dawson again: the design-first principle does extra work here. Say you set a reading for homework and announce a quiz at the start of the next class. Students do the reading at home knowing the in-class check is coming. The incentive is structural, and AI use is discouraged unless it enhances learning.

Questions worth asking your department

For each task we assign, what cognitive rep is it producing? Where does the learning happen?
How does AI remove or change the friction that used to produce that rep? What could introduce new friction?
Where in the week is the teacher's irreplaceable role actually deployed — and what gets cut to make room for it?

Closing

I am just beginning to realize how much of this I do not have figured out. The students I see are the students I see. Your class will look different. The directions above are what I am working through, not what I am sure of.

Tell me what you think.

A note on method: this issue was produced through the co-creation workflow I'm advocating. The idea, the angle, the practitioner observations, the curated sources, and the final wording are mine. An AI assistant calibrated to my voice (through a guide of phrases I've approved and rejected) did the research legwork on sources I selected and drafted from an outline we agreed on.

References

Curtis, G. J. The two-lane road to hell is paved with good intentions. Higher Education Research & Development. 2025.
Perkins, M., Roe, J., & Furze, L. The AI Assessment Scale Revisited: A Framework for Educational Assessment. arXiv preprint. 2024.
Sidoti, O., Park, E., & Gottfried, J. About a quarter of U.S. teens have used ChatGPT for schoolwork — double the share in 2023. Pew Research Center. 2025.
Dawson, P., Bearman, M., Dollinger, M., & Boud, D. Validity matters more than cheating. Assessment & Evaluation in Higher Education, 49(7), 1005–1016. 2024.
Liu, D., & Bridgeman, A. Frequently asked questions about the two-lane approach to assessment in the age of AI. Teaching@Sydney, University of Sydney. 2024.
Cambridge Assessment International Education. The use of generative AI in coursework. Cambridge Exams Officers' Guide, Phase 3. n.d.
Eaton, S. E. 6 Tenets of Postplagiarism: Writing in the Age of Artificial Intelligence. Learning, Teaching and Leadership. 2023.