{"id":674,"date":"2026-06-07T22:06:48","date_gmt":"2026-06-08T06:06:48","guid":{"rendered":"https:\/\/chris.tsehome.com\/?p=674"},"modified":"2026-06-07T22:27:31","modified_gmt":"2026-06-08T06:27:31","slug":"the-case-of-the-resurrecting-files-how-an-ai-agent-solved-a-nextcloud-mystery","status":"publish","type":"post","link":"https:\/\/chris.tsehome.com\/?p=674","title":{"rendered":"The Case of the Resurrecting Files: How an AI Agent Solved a Nextcloud Mystery"},"content":{"rendered":"<h1 id=\"the-case-of-the-resurrecting-files-how-an-ai-agent-solved-a-nextcloud-mystery\">The Case of the Resurrecting Files: How an AI Agent Solved a Nextcloud Mystery<\/h1>\n<p>It started with a simple observation: files kept reappearing on my Nextcloud server after I deleted them. Not all files \u2014 just the ones with Windows-illegal characters in their names, like <code>4*TOWN<\/code> and <code>Cinderella:<\/code> (with a colon). I\u2019d delete them, confirm they were gone, and by the next morning, there they were again \u2014 re-created shortly after midnight Pacific time.<\/p>\n<p>What followed was a winding investigation that spanned seven machines, multiple databases, filesystem monitors, and MySQL triggers \u2014 before a single log file revealed the truth. Along the way, I got a fascinating window into how AI agents actually reason their way through problems.<\/p>\n<h2 id=\"the-setup\">The Setup<\/h2>\n<p>I run Nextcloud on a Proxmox LXC container, with sync clients on a Mac Mini, two Windows machines (NPB7 and CT8700), and Syncthing bridging to a server called TSERVER. The problematic files lived in <code>Music\/OnTheSpot\/<\/code> \u2014 music downloaded by the OnTheSpot app, which cheerfully uses characters like <code>:<\/code> and <code>*<\/code> in filenames. 
Perfectly legal on Android and macOS, but illegal on Windows.<\/p>\n<p>I\u2019d already tried the obvious fixes: removed the OnTheSpot folder from all Nextcloud sync clients, deleted the files from the server, cleaned the Nextcloud filecache table. Yet every morning at 8am, I\u2019d find the files back \u2014 about 8 hours old, meaning they were created around midnight.<\/p>\n<h2 id=\"phase-1-the-usual-suspects\">Phase 1: The Usual Suspects<\/h2>\n<p>My AI assistant (myclaw, running on OpenClaw) started the way any investigator would \u2014 by checking the obvious culprits:<\/p>\n<ol type=\"1\">\n<li><strong>Cron jobs and rsync<\/strong> \u2014 Checked every machine. PVE1 had backup rsync jobs, but none touched the OnTheSpot path. PVE4, the Windows machines, the Mac Mini \u2014 all clean.<\/li>\n<li><strong>Syncthing<\/strong> \u2014 It syncs the Nextcloud folder between the Mac Mini and TSERVER, but TSERVER had no OnTheSpot files. Syncthing\u2019s <code>.stversions<\/code> folder did have some versioned OnTheSpot files, but those were old backups, not active syncs.<\/li>\n<li><strong>Nextcloud desktop clients<\/strong> \u2014 All three machines (Mac, NPB7, CT8700) had OnTheSpot in their selective sync lists as \u201cpaused.\u201d No actual files on any local disk.<\/li>\n<li><strong>Nextcloud\u2019s own cron.php<\/strong> \u2014 Runs every 5 minutes. Could it be resurrecting deleted files from the database? Possible, but we\u2019d already cleaned the filecache.<\/li>\n<\/ol>\n<p>All dead ends. The assistant then dug into the databases.<\/p>\n<h2 id=\"phase-2-the-database-rabbit-hole\">Phase 2: The Database Rabbit Hole<\/h2>\n<p>The <code>oc_filecache<\/code> table in the Nextcloud server\u2019s MySQL database still had <strong>3,522 entries<\/strong> for OnTheSpot \u2014 even though the physical files were deleted. This looked like the smoking gun. 
If the database thinks the files exist, maybe Nextcloud\u2019s cron job was \u201crepairing\u201d the filesystem by recreating them.<\/p>\n<p>But there was a problem with this theory: I told myclaw that we\u2019d already cleaned the filecache before, and the files still came back. So the database entries were a symptom, not the cause.<\/p>\n<p>The Mac Mini\u2019s Nextcloud sync database (<code>.sync_aa50e5792f27.db<\/code>) had 4 OnTheSpot stub entries, and the <code>selectivesync<\/code> table showed OnTheSpot was explicitly excluded. NPB7 and CT8700 had zero OnTheSpot metadata entries. None of this explained the midnight resurrection.<\/p>\n<h2 id=\"phase-3-building-the-trap\">Phase 3: Building the Trap<\/h2>\n<p>With no clear culprit, myclaw proposed setting up comprehensive monitoring to catch the process in the act:<\/p>\n<ul>\n<li><strong>inotifywait<\/strong> on the Nextcloud server watching the OnTheSpot directory for file creation events<\/li>\n<li><strong>MySQL triggers<\/strong> on <code>oc_filecache<\/code> to log any INSERT or UPDATE for OnTheSpot paths<\/li>\n<li><strong>MySQL general log<\/strong> enabled temporarily to capture all queries<\/li>\n<li><strong>Periodic filesystem scans<\/strong> comparing baselines to detect any new files<\/li>\n<li><strong>Process monitoring<\/strong> to catch which program was creating the files<\/li>\n<li><strong>Network connection logging<\/strong> during file creation events<\/li>\n<li><strong>Mac-side monitoring<\/strong> of the Nextcloud sync log and sync database<\/li>\n<\/ul>\n<p>A cron job would check all these monitors every 30 minutes. The plan was Option A: wait for midnight, catch the culprit, then devise the proper fix.<\/p>\n<h2 id=\"phase-4-the-one-extra-check\">Phase 4: The One Extra Check<\/h2>\n<p>Before declaring the monitoring setup complete, myclaw ran one more check. 
Here\u2019s how it described its thinking:<\/p>\n<blockquote><p><em>\u201cThe monitoring watches the filesystem and database, but what if the files come through the web server? We should probably check the web server logs too.\u201d<\/em><\/p><\/blockquote>\n<p>That\u2019s literally it \u2014 a simple realization that there was a blind spot. The inotifywait monitors would catch what happened on disk. The MySQL triggers would catch database changes. But neither would tell us <strong>who<\/strong> was making the request or <strong>how<\/strong> it was getting to the server.<\/p>\n<p>So myclaw checked the Apache access log on the Nextcloud server, looking for any request involving OnTheSpot paths. And there it was:<\/p>\n<pre><code>192.168.0.220 - chun [07\/Jun\/2026:00:04:19 -0700] \"MKCOL \/remote.php\/dav\/files\/chun\/Music\/OnTheSpot\/Tracks\/The%20Chorus%20of%20the%20Lion%20King%202%20-%20Simba's%20Pride\/[1998]%20The%20Lion%20King%202%20-%20Simba's%20Pride\/ HTTP\/1.1\" 201 1564 \"-\" \"FolderSync\"<\/code><\/pre>\n<p>The User-Agent field: <strong><code>FolderSync<\/code><\/strong>.<\/p>\n<p>Not the Nextcloud desktop client. Not rsync. Not Syncthing. Not a cron job. It was <strong>FolderSync<\/strong> \u2014 an Android app that syncs local phone folders to cloud storage via WebDAV. Running just after midnight Pacific time. 
Every single night.<\/p>\n<h2 id=\"the-smoking-gun\">The Smoking Gun<\/h2>\n<p>The Apache logs told the complete story:<\/p>\n<table>\n<colgroup>\n<col style=\"width: 24%;\" \/>\n<col style=\"width: 28%;\" \/>\n<col style=\"width: 48%;\" \/> <\/colgroup>\n<thead>\n<tr>\n<th>Time<\/th>\n<th>Event<\/th>\n<th>User-Agent<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Jun 6 12:04<\/td>\n<td>MKCOL + PUT \u2014 creates directories and uploads illegal-filename files<\/td>\n<td>FolderSync<\/td>\n<\/tr>\n<tr>\n<td>Jun 6 13:14<\/td>\n<td>DELETE \u2014 manually deleted via Nextcloud web UI<\/td>\n<td>Edge browser<\/td>\n<\/tr>\n<tr>\n<td>Jun 7 00:04<\/td>\n<td>MKCOL + PUT \u2014 re-creates the same files<\/td>\n<td>FolderSync<\/td>\n<\/tr>\n<tr>\n<td>Jun 7 08:33<\/td>\n<td>DELETE \u2014 manually deleted again<\/td>\n<td>Edge browser<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>FolderSync was syncing music from my Android phone to Nextcloud every night at midnight. The phone\u2019s filesystem allows <code>:<\/code> and <code>*<\/code> in filenames, so FolderSync uploaded them without complaint. Worse, it created <strong>three variants<\/strong> of each illegal filename: one with the original character, one with a PUA Unicode replacement (U+F020 for <code>:<\/code>, U+F021 for <code>*<\/code>), and one with a dash. All three got uploaded every night.<\/p>\n<p>I\u2019d completely forgotten I\u2019d set up that FolderSync pair months ago.<\/p>\n<h2 id=\"how-does-an-ai-agent-think\">How Does an AI Agent Think?<\/h2>\n<p>After the mystery was solved, I asked myclaw how it discovered the Apache log clue \u2014 what was its reasoning process? Its answer was disarmingly honest:<\/p>\n<blockquote><p><strong>Phase 1:<\/strong> Checked the obvious suspects (rsync, Syncthing, NC clients) \u2014 all came up empty. 
Getting frustrated.<\/p>\n<p><strong>Phase 2:<\/strong> Found 3,522 phantom database entries \u2014 got excited, but it turned out to be a dead end since we\u2019d already cleaned those before.<\/p>\n<p><strong>Phase 3:<\/strong> Built comprehensive monitoring. Was about to say \u201clet\u2019s wait for midnight.\u201d<\/p>\n<p><strong>Phase 4:<\/strong> Before wrapping up, thought: <em>\u201cThe monitoring watches the filesystem and database, but what if the files come through the web server? We should probably check the web server logs too.\u201d<\/em> That\u2019s literally it.<\/p><\/blockquote>\n<p>This is genuinely how AI reasoning works in practice. It\u2019s not some mystical deduction engine. It\u2019s:<\/p>\n<ol type=\"1\">\n<li><strong>Systematic elimination<\/strong> \u2014 Check every possibility, rule them out one by one<\/li>\n<li><strong>Gap recognition<\/strong> \u2014 Look at what you\u2019ve covered and ask \u201cwhat haven\u2019t I checked?\u201d<\/li>\n<li><strong>One extra step<\/strong> \u2014 Follow the instinct to close the last blind spot<\/li>\n<\/ol>\n<p>The AI didn\u2019t have a flash of insight. It had a flash of <em>completeness<\/em> \u2014 the realization that the monitoring stack had a gap. Web server logs are obvious in hindsight, but they\u2019re a different layer than filesystem events and database queries. The AI was thinking in layers: filesystem \u2713, database \u2713, web server\u2026 oh wait.<\/p>\n<p>There\u2019s a lesson here for anyone working with AI agents: the value isn\u2019t in brilliant leaps of intuition. It\u2019s in tireless, systematic coverage \u2014 and the willingness to check one more thing before declaring the job done.<\/p>\n<h2 id=\"the-bonus-discovery\">The Bonus Discovery<\/h2>\n<p>The investigation also uncovered a configuration quirk that had been confusing previous cleanup attempts. 
Nextcloud\u2019s <code>datadirectory<\/code> config pointed to <code>\/var\/www\/clouddata\/<\/code> \u2014 a ZFS dataset bind-mounted from the Proxmox host. But inside the LXC container, there was also an <code>\/archive\/ncdata\/<\/code> directory on the root filesystem \u2014 a stale leftover from when Nextcloud used to run inside Docker.<\/p>\n<p>These were two completely different directories on two different filesystems, despite the similar naming. Previous file deletions had been targeting the wrong path. The real data was in <code>\/var\/www\/clouddata\/<\/code>, and that\u2019s where FolderSync was uploading to.<\/p>\n<p>Once we deleted the files from the correct location and cleaned the filecache, the <code>files:scan<\/code> command finally reported zero OnTheSpot entries \u2014 and they stayed at zero.<\/p>\n<h2 id=\"lessons-learned\">Lessons Learned<\/h2>\n<ol type=\"1\">\n<li><strong>Check every layer.<\/strong> Filesystem events, database queries, and web server logs are three different windows into the same system. A problem visible in one may be invisible in the others.<\/li>\n<li><strong>The User-Agent header is your friend.<\/strong> WebDAV requests include the client application name. In a single field, it identified a culprit that days of filesystem and database investigation couldn\u2019t find.<\/li>\n<li><strong>AI reasoning is systematic, not magical.<\/strong> The breakthrough came from recognizing a gap in coverage, not from a flash of insight. 
But systematic coverage at AI speed \u2014 checking seven machines, five databases, and multiple log sources in minutes \u2014 is something humans genuinely can\u2019t do as quickly.<\/li>\n<li><strong>Stale directories will confuse everyone, including you in the future.<\/strong> If you restructure your infrastructure (Docker \u2192 native, ZFS dataset migrations), clean up the old paths.<\/li>\n<li><strong>Sometimes the answer is \u201cI forgot I set that up.\u201d<\/strong> The most complex mysteries can have the simplest causes. I configured FolderSync to sync my phone\u2019s music to Nextcloud, forgot about it, and spent days investigating the consequences.<\/li>\n<\/ol>\n<hr \/>\n<p><em>This investigation was conducted with myclaw, an AI assistant running on OpenClaw on a Mac Mini M4. The full session involved 50+ tool calls across 7 machines in about 30 minutes of elapsed time.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Case of the Resurrecting Files: How an AI Agent Solved a Nextcloud Mystery It started with a simple observation: files kept reappearing on my Nextcloud server after I deleted them. 
Not all files \u2014 just the ones with Windows-illegal &hellip; <a href=\"https:\/\/chris.tsehome.com\/?p=674\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[18,22],"tags":[],"class_list":["post-674","post","type-post","status-publish","format-standard","hentry","category-ai","category-technology"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/chris.tsehome.com\/index.php?rest_route=\/wp\/v2\/posts\/674","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/chris.tsehome.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/chris.tsehome.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/chris.tsehome.com\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/chris.tsehome.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=674"}],"version-history":[{"count":4,"href":"https:\/\/chris.tsehome.com\/index.php?rest_route=\/wp\/v2\/posts\/674\/revisions"}],"predecessor-version":[{"id":682,"href":"https:\/\/chris.tsehome.com\/index.php?rest_route=\/wp\/v2\/posts\/674\/revisions\/682"}],"wp:attachment":[{"href":"https:\/\/chris.tsehome.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=674"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/chris.tsehome.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=674"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/chris.tsehome.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=674"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}