WordPress Security Audit Checklist 2026: The Database Steps Most Guides Skip


Why a Clean File Scan Is Not the Same as a Clean Site

Wordfence returned a clean report. Sucuri flagged nothing. And yet Google Search Console is indexing pages filled with Japanese casino keywords, pharmaceutical product names, or anchor text you never wrote. A passing file scan alongside active SEO spam indexed in your name is not a scanner failure; it is a scope boundary. File-based security tools inspect PHP and JavaScript on disk. Database payloads live in wp_options, wp_posts, wp_postmeta, and wp_termmeta rows, which a file-layer scan never reads. A payload stored as a base64-encoded autoload option, injected into post metadata as a serialized link array, or nested three levels deep inside a Gutenberg block’s innerBlocks JSON will therefore not appear in any file-layer scan. This WordPress security audit checklist is the complementary step: five SELECT queries you can run in phpMyAdmin right now, organized by symptom, to audit exactly the layer your current tooling leaves uncovered.

The “Files Clean, Site Still Infected” Pattern

The symptom pattern is recognizable. Google Search Console flags indexed pages with Japanese keyword clusters – casino, loan, or pharmaceutical terms – on a site that returned a clean file scan. Or you spot pharma product links in the HTML source of a legitimate post but find nothing when searching the WordPress editor. Or a client reports a manual action for “unnatural links” on a site you personally audited three weeks ago. In each case, the injected content lives in the database, not on disk.

The Japanese keyword hack injects auto-generated spammy content directly into database tables – including wp_posts and wp_options – meaning a clean file scan does not rule it out. It also uses cloaking: human visitors and logged-in admins see a normal page or a 404, while search engine bots receive the full spam page. You can audit your own site for an hour and see nothing, while Google’s index accumulates thousands of spam pages in your domain’s name. A direct verification step: fetch one of the flagged GSC URLs using a Googlebot user-agent and compare the response body against what you see when logged in. If the responses differ materially, you have confirmed cloaking. The exact curl command is in the cloaking section below.

What Changed for 2025–2026

Sucuri alone observed over 500,000 websites that became infected in 2024, according to Patchstack’s State of WordPress Security 2025 report, and that figure represents only the sites visible to one provider. What has sharpened the database-layer risk specifically is the expansion of the plugin ecosystem around Gutenberg as an initial-access vector. CVE-2024-9234 in the GutenKit plugin and CVE-2024-9707 in the Hunk Companion plugin (both rated CVSS 9.8) allowed unauthenticated attackers to install and activate arbitrary plugins, establishing persistent footholds. In August 2024, Wordfence disclosed a privilege escalation in Post Grid and Gutenberg Blocks affecting over 40,000 active installations. Access gained through either class of vulnerability can be used to write payloads into the database, where a file scan will not look for them.

Some security plugins do include database scanning features, and that is worth acknowledging. Wordfence and MalCare both scan portions of the database. Where the coverage tends to be thin in practice is depth: autoload rows in wp_options, payloads serialized inside page-builder metadata, and content nested inside Gutenberg’s innerBlocks JSON structures are not the inspection targets those tools were built around. The rest of this checklist covers exactly that layer – not as a replacement for the file scanner you already run, but as the complementary step your current tooling leaves unaddressed.

The Five Tables Attackers Actually Use (And What Lives in Each)

Treating “the database” as one thing to inspect is what makes database-layer audits feel overwhelming. In practice, the majority of database-resident payloads concentrate in five core WordPress tables, each with a distinct payload shape. Working through them one at a time turns an opaque problem into a structured checklist.

`wp_options` and the Autoload Persistence Trick

Every WordPress page load begins with a bulk query that fetches every autoloaded row from wp_options, before your theme renders and before your plugins fire. Up to WordPress 6.5 those rows are marked autoload = 'yes'; from WordPress 6.6 onward the values 'on', 'auto', and 'auto-on' are autoloaded as well. Attackers exploit this by writing their payload into an autoload row, which means the malicious content is loaded on every page request without a single PHP file ever being modified. Because the payload lives in a database row rather than a file, password resets and plugin reinstalls leave it untouched, and it keeps loading on every page request until the row itself is removed.

The fingerprints are surprisingly direct. During Japanese keyword hack intrusions, investigators have found rows with option_name values like base64_code – a documented indicator – sitting alongside option_value fields containing long base64-encoded strings where you would expect a serialized array or a plain settings string. Patchstack’s 2024 research documented the tagDiv Composer Stored XSS as a real-world instance: attackers injected malicious code into tagDiv plugin options stored as wp_options rows, with the payload living entirely in the database and requiring no file write. If an option_name is not something your active plugins would have written, and its option_value is a wall of base64, that row warrants investigation before anything else does.
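To make the "wall of base64" check less of a judgment call, here is a minimal Python sketch that decodes long base64 runs in an option_value and reports which suspicious markers appear inside. The 120-character threshold and the marker list are assumptions chosen for illustration, not values taken from any scanner; tune both for your site.

```python
import base64
import binascii
import re

# Assumption: 120+ unbroken base64 characters counts as "long"; tune for your site.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{120,}={0,2}")

# Strings that commonly appear in decoded PHP or JavaScript payloads (illustrative list).
SUSPICIOUS = ("eval(", "base64_decode", "<script", "gzinflate", "document.write")


def inspect_option_value(option_value: str) -> list[str]:
    """Decode long base64 runs and return the suspicious markers found inside."""
    found = []
    for match in BASE64_RUN.finditer(option_value):
        run = match.group(0).rstrip("=")
        # Trim to a multiple of four so a cut-off run still decodes.
        run = run[: len(run) - (len(run) % 4)]
        try:
            decoded = base64.b64decode(run).decode("utf-8", errors="replace")
        except (binascii.Error, ValueError):
            continue
        found.extend(m for m in SUSPICIOUS if m in decoded and m not in found)
    return found
```

An empty list does not clear the row: payloads can be compressed or double-encoded, so treat the result as a way to prioritize rows, not to dismiss them.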

`wp_posts` and `wp_postmeta` — Where SEO Spam Actually Lives

The Japanese keyword hack injects auto-generated spam content directly into wp_posts.post_content – casino terms, loan keywords, pharmaceutical product names – sometimes as hidden <div> elements with display:none, sometimes as full auto-generated pages with their own slugs and published status. The MalCure remediation walkthrough documents the same pattern: spam posts exist alongside legitimate content and are indexed by search engines while remaining invisible to anyone browsing the site as a human, because the cloaking layer serves them only to bots. Filtering the WordPress post list by status will often not surface them — attackers publish under plausible post types or set non-standard statuses.

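The Gutenberg block format adds another layer of inspection complexity. Block-editor posts store each block between HTML comment delimiters, with the block’s attributes as JSON inside the opening comment, and blocks can nest arbitrarily deep (the innerBlocks arrays you see once the content is parsed). A script payload inserted two or three levels into that nesting may not surface in a flat LIKE '%<script%' query against post_content, because the block serializer stores < inside attribute JSON as the escape \u003c, so the literal string <script never appears in the row. Recursive parsing of the block tree is the reliable approach for Gutenberg-rendered posts; the WP-CLI starting point is in “The Gutenberg Block-Data Gap” section below.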

Per Sucuri’s database cleanup guidance, wp_postmeta is among the tables most commonly found to carry malicious content in WordPress incidents – yet it rarely appears in published audit checklists. The pattern here is hidden external link arrays or redirect rules stored as serialized PHP in meta_value rows attached to legitimate posts. Because the post itself looks clean in the editor, and because wp_postmeta is not rendered visibly in the admin UI, these rows can sit undetected for months.
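Once a meta_value row is in front of you, the practical question is which outside domains it points at. The following is a rough Python sketch that pulls URLs out of a serialized meta_value with a regular expression and returns the hostnames that are not your own. It does not unserialize the PHP structure, and the function name and the site_host parameter are hypothetical, introduced only for this example.

```python
import re
from urllib.parse import urlparse

# Stop at quotes, whitespace, angle brackets, and the ';' that ends a serialized string.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>;]+", re.IGNORECASE)


def external_hosts(meta_value: str, site_host: str) -> set[str]:
    """Return hostnames linked from meta_value that are not the site itself."""
    hosts = set()
    for url in URL_PATTERN.findall(meta_value):
        host = (urlparse(url).hostname or "").lower()
        # Treat the site's own host and its subdomains as internal.
        if host and host != site_host and not host.endswith("." + site_host):
            hosts.add(host)
    return hosts
```

Legitimate plugins store external URLs too (CDN assets, embed sources), so the output is a list of domains to recognize or question, not a list of confirmed spam.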

`wp_users` and `wp_termmeta` — The Tables Nobody Checks

A database audit following a Japanese keyword hack should include reviewing wp_users for unknown administrator accounts that may have been created as part of the intrusion. These accounts typically have innocuous-looking usernames, no associated posts, and a registration timestamp that postdates your legitimate users. The danger is re-entry: cleanup efforts focused on posts and options leave an active admin credential in place, making reinfection straightforward.

wp_termmeta – the metadata table for taxonomy terms like categories and tags – is the least-scrutinized surface in a standard audit, which is why it gets used as a payload dump. Attackers attach encoded content to term IDs that nobody will think to inspect, because the visible category page appears completely normal. The content is never rendered to users directly; it serves as storage for a payload that something else in the request cycle retrieves and executes.

The SELECT-First Audit: Safe Queries You Can Run Right Now

Before a single query runs, export a full database backup. Via WP-CLI: wp db export backup-$(date +%F).sql. Via phpMyAdmin: Export tab → Custom → select all tables → gzip → Go. Every query in this section is SELECT-only: it reads rows and changes nothing. If you later decide a row needs to be removed, do that only once you have confirmed the backup is complete and restorable, so a mistaken deletion can be rolled back. That sequencing is the safety model.
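A quick sanity check on the export is worth the thirty seconds. The Python sketch below scans an uncompressed .sql dump for the CREATE TABLE statements of the five tables this checklist audits and lists any that are missing. It assumes mysqldump-style output with backtick-quoted table names; a gzipped phpMyAdmin export would need to be decompressed first. The function name is hypothetical.

```python
import re
from pathlib import Path

# The five tables audited in this checklist, without the table prefix.
EXPECTED_TABLES = ("options", "posts", "postmeta", "users", "termmeta")

# Assumption: mysqldump-style statements with backtick-quoted names.
CREATE_TABLE = re.compile(r"CREATE TABLE (?:IF NOT EXISTS )?`([^`]+)`")


def missing_tables(dump_path: str, prefix: str = "wp_") -> list[str]:
    """List audited tables that have no CREATE TABLE statement in the dump."""
    seen = set()
    # Read line by line so a large dump is never held in memory at once.
    with Path(dump_path).open(encoding="utf-8", errors="replace") as dump:
        for line in dump:
            match = CREATE_TABLE.match(line)
            if match:
                seen.add(match.group(1))
    return [prefix + name for name in EXPECTED_TABLES if prefix + name not in seen]
```

An empty result only shows the table definitions are present. It says nothing about whether the rows restored correctly, so a test restore into a scratch database remains the stronger check.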

Sucuri’s database cleanup methodology centers on targeted SQL searches across wp_posts, wp_postmeta, wp_options, and wp_users via phpMyAdmin or Adminer – not on running a plugin and waiting for a verdict. The queries below follow the same logic, organized by the symptom that brought you here.

Triage Order by Symptom

Start with the table most likely to surface your specific symptom, then work outward. If Google Search Console is flagging indexed Japanese or casino-term URLs, open with wp_posts and wp_options – the Japanese keyword hack deposits spam content in both. If you are seeing pharmaceutical product names in the HTML source of a legitimate post, check wp_posts first, then wp_postmeta for hidden link arrays attached to that post. The pharma hack pattern documented by NOC.org injects pharmaceutical product references into existing pages and uses cloaking, so the post body looks clean in the editor while the rendered response carries the spam. Unexplained redirects that appeared without a plugin change point toward wp_options autoload rows. If you have already cleaned a site once and symptoms have returned, go directly to wp_options and wp_users: the autoload row you missed, or the rogue administrator account that re-established access, is the most common explanation for reinfection within days of a cleanup.

The Five Queries

Run each of these in your phpMyAdmin SQL tab or Adminer query runner. Adjust the table prefix if yours is not wp_.

wp_options autoload sweep – targets eval, base64_decode, and suspicious option names in autoload rows:

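-- 'yes' is the pre-6.6 autoload flag; 'on', 'auto', and 'auto-on' are used from WordPress 6.6
SELECT option_id, option_name, autoload, LEFT(option_value, 200) AS value_preview
FROM wp_options
WHERE autoload IN ('yes', 'on', 'auto', 'auto-on')
  AND (
    option_value LIKE '%base64_decode%'
    OR option_value LIKE '%eval(%'
    OR option_name LIKE '%base64%'
  );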

Look for option_name values like base64_code – a documented Japanese keyword hack indicator – or long encoded strings in option_value where a serialized settings array would normally appear.

wp_posts script and spam content check – surfaces injected script tags, hidden divs, and pharma or Japanese keyword content:

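-- 'u003cscript' catches <script stored as a JSON escape inside block attributes
SELECT ID, post_title, post_type, post_status, post_modified
FROM wp_posts
WHERE post_content LIKE '%<script%'
   OR post_content LIKE '%u003cscript%'
   OR post_content LIKE '%display:none%'
   OR post_content LIKE '%display: none%'
   OR post_content LIKE '%viagra%'
   OR post_content LIKE '%カジノ%';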
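A single keyword such as カジノ only catches spam that happens to use that word. As a broader net, this Python sketch scores each post by the share of its characters that are Japanese kana and flags anything above a threshold. It assumes the site does not legitimately publish Japanese content; the 5% threshold and both function names are illustrative choices, not established values.

```python
import re

# Hiragana and Katakana blocks; Kanji is left out because it overlaps with Chinese.
KANA = re.compile(r"[\u3040-\u30ff]")


def kana_ratio(post_content: str) -> float:
    """Return the share of non-whitespace characters that are Japanese kana."""
    visible = [ch for ch in post_content if not ch.isspace()]
    if not visible:
        return 0.0
    return sum(1 for ch in visible if KANA.match(ch)) / len(visible)


def flag_posts(rows, threshold: float = 0.05):
    """rows: iterable of (post_id, post_content); yield IDs at or above the threshold."""
    for post_id, content in rows:
        if kana_ratio(content) >= threshold:
            yield post_id
```

Feed it the ID and post_content columns exported from wp_posts. Because the score is a ratio, a long legitimate post with one quoted Japanese phrase stays under the threshold while an auto-generated spam page does not.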

wp_postmeta hidden link sweep – catches external link arrays and encoded payloads in metadata attached to otherwise-clean posts:

SELECT meta_id, post_id, meta_key, LEFT(meta_value, 300)
FROM wp_postmeta
WHERE meta_value LIKE '%<a href%'
   OR meta_value LIKE '%base64_decode%'
   OR meta_value LIKE '%eval(%';

wp_users rogue administrator check – lists every account carrying the administrator capability so you can verify each one belongs there:

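-- The meta_key carries the table prefix too: with prefix abc_ it is 'abc_capabilities'
SELECT u.ID, u.user_login, u.user_registered
FROM wp_users u
JOIN wp_usermeta m ON u.ID = m.user_id
WHERE m.meta_key = 'wp_capabilities'
  AND m.meta_value LIKE '%administrator%'
ORDER BY u.user_registered DESC;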
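On a site with more than a handful of administrators, checking the result by eye gets unreliable. One way to triage it, sketched in Python below, is to compare the rows against a list of logins you know are legitimate and a date after which no new administrator should exist. The function name, the allowlist, and the cutoff are all hypothetical inputs you would supply yourself.

```python
from datetime import datetime


def unexpected_admins(rows, known_logins, cutoff: str):
    """Return (unknown logins, logins registered on or after cutoff).

    rows: iterable of (ID, user_login, user_registered), with user_registered
    formatted as 'YYYY-MM-DD HH:MM:SS'. cutoff is 'YYYY-MM-DD'.
    """
    cutoff_dt = datetime.strptime(cutoff, "%Y-%m-%d")
    known = {login.lower() for login in known_logins}
    unknown, recent = [], []
    for _user_id, login, registered in rows:
        if login.lower() not in known:
            unknown.append(login)
        try:
            when = datetime.strptime(registered, "%Y-%m-%d %H:%M:%S")
        except ValueError:
            when = None  # e.g. '0000-00-00 00:00:00', which is itself worth a look
        if when is None or when >= cutoff_dt:
            recent.append(login)
    return unknown, recent
```

An account that lands in both lists is the first one to investigate. Registration dates can be altered by anyone with database access, so an old date on an unknown login is not reassurance.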

wp_termmeta payload check – the table most audits skip entirely:

SELECT *
FROM wp_termmeta
WHERE meta_value LIKE '%<script%'
   OR meta_value LIKE '%base64_decode%';

Reading the Results Without False-Positive Fatigue

A match is a lead, not a verdict. A <script> tag in post_content may be a legitimate analytics embed; a base64 string in wp_options may be a plugin’s encoded license token. The decision rule: if you cannot immediately name which plugin or theme wrote this row, and the matched pattern is suspicious, record the option_id, meta_id, or ID and investigate its origin before touching anything. For wp_posts rows, check post_modified against your deployment history. wp_options and the meta tables have no timestamp columns, so compare those rows against a known-good backup instead, and look up the option_name or meta_key against the plugin that would legitimately own it.

Three findings warrant immediate escalation beyond query-and-investigate: a wp_users administrator account you cannot account for, an autoload row containing executable PHP that resolves on page load, or a confirmed difference between what Googlebot receives and what you see logged in. Any one of those means you are dealing with an active compromise, and the cleanup scope extends past a few rows — your file scanner’s output becomes the next thing to cross-reference, because the database payload and the initial-access vector often point to each other.

The Layers That Make Database Malware Hard to See (and What to Do About It)

If you ran those five queries and found something you cannot explain, the natural next question is why you missed it on every previous review of the same site. The answer is structural – three properties of database-resident payloads work against the kind of review a site owner or agency typically performs.

Cloaking – and How to See What Googlebot Sees

The Japanese keyword hack and the pharma hack both rely on cloaking: injected content is served to search engine bots while human visitors and logged-in admins see a normal page or a 404. This is a hard conditional in the payload’s delivery logic that checks the incoming user-agent and returns entirely different content depending on whether the request looks like Googlebot or a human. The 2026 cleanup categorization by Fysal Yaqoob treats SEO spam and database injections as distinct active threat categories for exactly this reason — the payload is invisible to the site owner by design.

The verification step is a direct user-agent fetch. Run this against any URL flagged in Google Search Console:

curl -A "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" \
  https://yoursite.com/suspicious-path/

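Compare the response body against what you see when logged in. If the two responses differ materially (extra content, different title tags, injected links), you have confirmed cloaking and can treat the investigation as an active incident. If they match, cloaking is not ruled out: some payloads also check the requesting IP address, in which case the URL Inspection live test in Google Search Console shows what Googlebot actually receives. Browser-based alternative: Chrome DevTools → More tools → Network conditions → User agent → uncheck “Use browser default” → Custom → paste the Googlebot string, then reload.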
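Comparing two long HTML responses by eye is error-prone, so the comparison can be scripted. This Python sketch fetches a URL once with a browser user-agent and once with the Googlebot string, then returns only the lines that appear in the bot response. It uses the standard library only; the browser user-agent string and the function names are assumptions made for the example.

```python
import difflib
import urllib.error
import urllib.request

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Assumption: any ordinary desktop browser string works as the "human" baseline.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"


def fetch(url: str, user_agent: str) -> str:
    """Return the response body for url, including the body of 4xx/5xx pages."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=15) as response:
            return response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        # Cloaked URLs often return a 404 to humans; keep that body for the diff.
        return error.read().decode("utf-8", errors="replace")


def added_lines(browser_body: str, bot_body: str) -> list[str]:
    """Return the lines that appear only in the response served to the bot."""
    a, b = browser_body.splitlines(), bot_body.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            added.extend(b[j1:j2])
    return added
```

Typical use is added_lines(fetch(url, BROWSER_UA), fetch(url, GOOGLEBOT_UA)) for each flagged URL. Pages with nonces or timestamps will always show a few changed lines, so look for whole blocks of links or keyword text instead of single-line differences.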

Autoload Persistence in Plain Language

The autoload mechanism is why a compromised site can look fully clean after a thorough remediation attempt and then revert within days. When an attacker writes a payload into an autoload row, resetting passwords does not touch it. Reinstalling plugins does not touch it. Switching themes does not touch it. None of those operations run a database query that would remove or overwrite the row, so the payload continues to load on every visit until the specific row is identified and removed. This is the most common explanation for a site that appears clean for a few days after cleanup and then redisplays symptoms — the row was never found in the first place.
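Since the row has to be found before it can be removed, comparing the current options table against a known-good export is one way to narrow the search. The Python sketch below assumes two CSV exports of wp_options, each with a header row containing option_name and option_value columns; that file layout and the function names are assumptions for illustration.

```python
import csv
import hashlib

# Some option values (rewrite rules, page-builder settings) exceed the default field limit.
csv.field_size_limit(10_000_000)


def load_options(csv_path: str) -> dict[str, str]:
    """Map option_name to a SHA-256 digest of option_value from a CSV export."""
    options = {}
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            digest = hashlib.sha256(row["option_value"].encode("utf-8")).hexdigest()
            options[row["option_name"]] = digest
    return options


def compare_options(baseline: dict[str, str], current: dict[str, str]):
    """Return (names added since the baseline, names whose value changed)."""
    added = sorted(set(current) - set(baseline))
    changed = sorted(n for n in current if n in baseline and current[n] != baseline[n])
    return added, changed
```

Transients and cron entries change constantly and will dominate the changed list, so the added list is usually the more useful of the two: an option name that no installed plugin accounts for is the candidate row.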

The Gutenberg Block-Data Gap

A flat LIKE '%<script%' query against wp_posts reads the raw string, and the raw string is not always what the browser ends up rendering. Inside a block’s attribute JSON the serializer stores < as \u003c, so a payload carried in the attributes of a block nested three levels deep contains no literal <script for the query to match. Recursive parsing is the reliable approach: export post_content for the posts under suspicion, then walk the block tree programmatically.

A WP-CLI starting point: list post IDs, then pull each post_content field and pipe it into a small PHP block-parser script that walks the innerBlocks arrays recursively and flags any block whose serialized attributes contain eval(, base64_decode, or <script:

wp post list --post_type=post --field=ID | 
  xargs -I {} wp post get {} --field=post_content
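The article describes a PHP parser for the walking step; as an alternative sketch, the walk itself can be done in Python once WordPress has parsed the blocks. This assumes you have already produced the parsed tree as JSON, for example by running parse_blocks() on a post’s content and passing the result through json_encode() in a wp eval call. The walker below is hypothetical illustration code, and the pattern list simply mirrors the three strings named above.

```python
import json

PATTERNS = ("eval(", "base64_decode", "<script")


def walk_blocks(blocks, path=()):
    """Yield (path, block_name, pattern) for every suspicious block in the tree.

    blocks: a list of dicts shaped like parse_blocks() output, each with
    blockName, attrs, innerHTML, and innerBlocks keys.
    """
    for index, block in enumerate(blocks):
        here = path + (index,)
        # json.dumps leaves '<' unescaped, so attrs can be searched as plain text.
        haystack = json.dumps(block.get("attrs") or {}) + (block.get("innerHTML") or "")
        for pattern in PATTERNS:
            if pattern in haystack:
                yield here, block.get("blockName"), pattern
        # Recurse into nested blocks, however deep they go.
        yield from walk_blocks(block.get("innerBlocks") or [], here)
```

The path tuple records where the hit sits in the tree: (0, 2, 1) means the second child of the third child of the first top-level block. Decoding the JSON turns \u003c back into <, which is why the plain <script pattern works here even when it fails in SQL.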

The reason this gap exists at the plugin level is tied to how Gutenberg-adjacent vulnerabilities work as initial-access vectors: the GutenKit and Post Grid disclosures referenced earlier give attackers footholds from which database rows, not only PHP files on disk, can be modified. That database blind spot in file-layer tooling is the context that makes these findings legible.

Run the five SELECT queries from “The Five Queries” section against your highest-priority client site this week. If anything surfaces that you cannot attribute to a known plugin or content entry, your file scanner is still the right starting point for confirming whether the initial-access vector left file-layer artifacts, and the database queries are how you establish whether the compromise extended into the rows where it tends to outlast everything else.

Your Next Action: Run the Five Queries Against Your Highest-Risk Site

Before you close this article, open phpMyAdmin or Adminer for the WordPress installation you came here to audit. Export a full database backup using wp db export backup-$(date +%F).sql via WP-CLI, or the Export tab in phpMyAdmin. Then run the five SELECT queries from the “The Five Queries” section — the wp_options autoload sweep, wp_posts script check, wp_postmeta link sweep, wp_users administrator check, and wp_termmeta payload check – in order. Record any match you cannot immediately attribute to a known plugin or content update.

If your results are clean, document that fact and integrate the query set into your audit checklist going forward; this is your new baseline. If anything surfaces that matches a suspicious pattern, cross-reference it against your deployment history (post_modified for wp_posts rows, a known-good backup for options and metadata, which carry no timestamp), then treat confirmed-cloaked content (verified via the Googlebot user-agent curl command) as an active incident requiring escalation to your file scanner’s output and access-log review. The database payload and the initial-access vector usually point to each other.
