Internal Controls Testing in 2026: Design vs Operating Effectiveness, Sample Sizes, and the AICPA Methodology

Internal controls testing is the systematic evaluation of whether a company’s controls are designed to prevent or detect material misstatements and whether they actually operated effectively during the audit period. In a 2026 audit, this work sits at the intersection of SAS 145 for private-company financial statement audits, PCAOB AS 2201 for integrated audits of public companies, and the COSO 2013 Internal Control Integrated Framework that underpins both. This guide explains the conceptual distinction between design and operating effectiveness, the sample sizes auditors use by control frequency, the three categories of controls auditors evaluate, and the most common deficiency findings that show up in real-world audit reports.

Key takeaways

  • Design effectiveness asks whether a control, if it operates as described, would prevent or detect a material misstatement. Operating effectiveness asks whether the control actually operated as designed throughout the period.
  • SAS 145 (AU-C 315 revised, effective for audits of periods ending on or after December 15, 2023) reshaped how private-company auditors identify and assess risks of material misstatement, including the inherent risk and control risk decomposition.
  • PCAOB AS 2201 governs integrated audits of internal control over financial reporting (ICFR) for public companies, requiring a top-down, risk-based approach starting from entity-level controls.
  • Sample sizes scale with control frequency: 1 for annual controls, 2 for quarterly, 2 to 5 for monthly, 5 to 15 for weekly, and 15 to 25 for daily or multiple-daily controls, following conventions drawn from the AICPA Audit Guide on Audit Sampling and firm methodologies.
  • The four deficiencies that appear most often in audit reports are segregation of duties gaps, lack of manual journal entry review, weak user access controls, and undocumented change management.

What is internal controls testing?

Internal controls testing is the auditor’s evaluation of whether a company’s controls over financial reporting are both well-designed and operating effectively. The work falls into two conceptual stages: design assessment and operating effectiveness testing. The AICPA and PCAOB are explicit that an auditor cannot conclude on operating effectiveness without first concluding on design.

The framework underneath the work is COSO 2013, the Internal Control Integrated Framework. It organizes internal control into 5 components (control environment, risk assessment, control activities, information and communication, monitoring) and 17 principles. Every control an auditor tests maps to at least one of the 17 principles, and both SAS 145 and AS 2201 reference COSO 2013.

The standards driving the work in 2026: SAS 145 (AU-C 315 revised), effective for audits of periods ending on or after December 15, 2023; PCAOB AS 2201, “An Audit of Internal Control Over Financial Reporting That Is Integrated with an Audit of Financial Statements”; and PCAOB AS 1000, “General Responsibilities of the Auditor in Conducting an Audit,” effective for fiscal years beginning on or after December 15, 2024.

Why internal controls testing matters

The audit reason is risk assessment. Under SAS 145, the auditor must separately assess inherent risk and control risk for each significant account and assertion. To assess control risk below maximum, the auditor must test controls. Without testing, the auditor defaults to a fully substantive approach, which typically means more detailed testing and higher audit cost.

The compliance reason is Section 404. Public companies subject to Section 404(b) of the Sarbanes-Oxley Act must have their auditor opine on the effectiveness of ICFR. AS 2201 sets the standard for that opinion. Material weaknesses must be disclosed; they trigger remediation and frequently move stock prices.

The operational reason is internal use. Even at private companies not subject to Section 404, controls testing surfaces process gaps that cost money: duplicate vendor payments, unauthorized journal entries, IT access persisting after terminations. The ACFE 2024 Report to the Nations found median fraud loss at organizations without anti-fraud controls runs roughly 2x the loss at organizations with them.

What is the difference between design and operating effectiveness?

This distinction is the single most important concept in controls testing, and the source of most first-year auditor confusion. The two effectiveness questions are sequential, conceptually different, and evidenced differently.

Design effectiveness. The auditor asks: if this control operates as described, would it prevent or detect a material misstatement in the relevant assertion? The evidence is typically a walkthrough. The auditor follows a transaction or event from initiation through recording, observing the control points along the way, and concluding whether the controls as configured would catch a misstatement. A control can be well-designed even if no one ran it. PCAOB AS 2201 paragraph 42 and AICPA AU-C 315 paragraph .26 are explicit: design effectiveness is concluded first, and only controls that are well-designed are tested for operating effectiveness.

Operating effectiveness. The auditor asks: did this control actually operate as designed throughout the period? The evidence is testing of a sample of instances. For a monthly control, the auditor selects a sample of months and inspects evidence that the control operated. For a daily control, the auditor selects a sample of days. The operating effectiveness conclusion covers the period, not a point in time, and requires evidence that the control ran with the same effectiveness throughout.

A worked example. A company has a control: “The Controller reviews monthly bank reconciliations for all bank accounts and documents review with a signature and date.” Design effectiveness: the auditor walks through one instance, confirms the bank-account population is complete, confirms the Controller is independent of the preparer, and concludes the control, as designed, would detect a material reconciling item. Operating effectiveness: the auditor samples 3 monthly reconciliations across the year, inspects each for signature, date, and follow-up on reconciling items. If all 3 pass, operating effectiveness is concluded. If 1 of 3 lacks evidence, the auditor expands the sample and may conclude the control is not operating effectively.
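The worked example reduces to two mechanical steps: pick the months, then check each sampled reconciliation for the required attributes. The following is a minimal Python sketch of that idea. The attribute names (`signed`, `dated`, `items_followed_up`) and the use of random selection are assumptions made for illustration; the guide does not prescribe a selection method.

```python
import random

# Hypothetical attribute names for the evidence the auditor inspects.
REQUIRED_ATTRIBUTES = ("signed", "dated", "items_followed_up")

def select_months(sample_size, seed=None):
    """Randomly pick distinct months (1-12) to test for a monthly control."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, 13), sample_size))

def evaluate_sample(results):
    """results maps month -> {attribute: bool}.

    Returns (conclusion, months with at least one missing attribute).
    """
    deviations = sorted(
        month for month, attrs in results.items()
        if not all(attrs.get(name, False) for name in REQUIRED_ATTRIBUTES)
    )
    if not deviations:
        return "no deviations: supports operating effectiveness", deviations
    return "deviation found: expand sample and reassess", deviations
```

A fixed `seed` makes the selection reproducible in the workpapers. The conclusion strings are only prompts: whether a deviation is isolated remains a matter of auditor judgment.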

How does the top-down, risk-based approach work under AS 2201 and SAS 145?

Both PCAOB AS 2201 and SAS 145 require auditors to use a top-down approach: start at the entity level, move to significant accounts and disclosures, identify relevant assertions, and select controls that address the risks at the assertion level. The intent is to focus testing on controls that matter and avoid wasting effort on controls that do not affect material misstatement risk.

Step 1 identifies entity-level controls, including the control environment, the period-end financial reporting process, controls over management override, and the company’s risk assessment process. Strong entity-level controls allow less detailed testing at the process level; weak ones force more.

Step 2 identifies significant accounts and disclosures based on quantitative materiality and qualitative factors (susceptibility to fraud, complexity, related-party involvement, recent changes).

Step 3 identifies relevant assertions for each significant account. Revenue typically has occurrence (existence) and cutoff as its primary relevant assertions; accounts payable typically has completeness.

Step 4 selects controls that address those relevant assertions. SAS 145 added an explicit requirement to consider both manual and automated controls and to evaluate IT general controls (ITGCs) over the IT systems that produce financial information.
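One way to picture the output of Steps 2 through 4 is a mapping from each significant account to its relevant assertions, and from each assertion to the controls selected for it. The sketch below is illustrative only: the class and field names are hypothetical, not part of SAS 145 or AS 2201. It shows how such a mapping can surface assertions that have no control selected yet.

```python
from dataclasses import dataclass, field

@dataclass
class Control:
    control_id: str          # hypothetical identifier scheme
    description: str
    automated: bool = False  # automated controls also depend on ITGCs

@dataclass
class SignificantAccount:
    name: str
    relevant_assertions: list
    # assertion name -> list of Control objects selected for it
    controls_by_assertion: dict = field(default_factory=dict)

    def uncovered_assertions(self):
        """Relevant assertions with no control selected for testing."""
        return [
            assertion for assertion in self.relevant_assertions
            if not self.controls_by_assertion.get(assertion)
        ]
```

An uncovered assertion is not automatically a deficiency. It signals that the auditor either needs to identify a control or plan to address that assertion substantively.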

What sample sizes do auditors use for controls testing?

Sample sizes follow conventions from the AICPA Audit Guide on Audit Sampling and firm methodologies. The convention scales with control frequency: more frequent controls require larger samples because the population is larger.

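The standard 2026 sample size table for controls assessed at low risk:

| Control frequency | Sample size |
|---|---|
| Annual | 1 |
| Quarterly | 2 |
| Monthly | 2 to 5 |
| Weekly | 5 to 15 |
| Daily or multiple times daily | 15 to 25 |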

Sample sizes increase when assessed risk is higher. A control over revenue recognition or management estimates draws a larger sample than a control over fixed asset depreciation. Samples also increase when the prior-year deviation rate is non-zero: a control with 1 deviation in a prior sample of 3 typically moves to 5 or 7 in the current year.

Automated controls follow a different convention. An automated control that runs the same way each time is typically tested with a sample of 1 plus a test of the IT general controls that ensure the application and configuration cannot change without authorization. This is the “test of one” approach.
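The frequency convention and the test-of-one rule can be expressed as a small lookup. This is a sketch under stated assumptions: the ranges are the ones quoted in this guide, while choosing the low end for lower-risk controls and the high end for higher-risk controls is a simplification introduced here, not a rule from the AICPA guide or any firm methodology.

```python
# Low-risk ranges as quoted in this guide; firm methodologies may differ.
SAMPLE_RANGES = {
    "annual": (1, 1),
    "quarterly": (2, 2),
    "monthly": (2, 5),
    "weekly": (5, 15),
    "daily": (15, 25),
}

def planned_sample_size(frequency, higher_risk=False, automated=False):
    """Return a planned sample size for one control.

    Assumption: use the low end of the range by default and the high end
    when assessed risk is higher or the prior year showed deviations.
    """
    if automated:
        # "Test of one": only valid alongside effective ITGCs.
        return 1
    low, high = SAMPLE_RANGES[frequency]
    return high if higher_risk else low
```

For example, `planned_sample_size("monthly")` gives 2, and `planned_sample_size("monthly", higher_risk=True)` gives 5.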

The three categories of controls (comparison)

Auditors organize the control universe into three categories. Each category is tested differently and produces different forms of evidence. Understanding the category determines the sample size, the nature of testing, and the audit response when a deficiency is identified.

| Category | Definition | Examples | How it is tested |
|---|---|---|---|
| Entity-level controls | Pervasive controls operating at the level of the organization as a whole. COSO 2013 Component 1 (Control Environment) and Component 2 (Risk Assessment). | Code of conduct, board oversight, whistleblower hotline, period-end close oversight, management override controls. | Inquiry, observation, document inspection. Sample size typically 1 to 3 per control. |
| Process-level controls | Controls operating within a specific business process or transaction cycle. | Three-way match in procurement, sales order credit limit check, monthly bank reconciliation, manual journal entry review. | Walkthrough for design, sample testing for operating effectiveness using the frequency-based sample sizes. |
| IT general controls (ITGCs) | Controls over the IT environment that allow application controls to be relied on. | User access provisioning and review, change management, computer operations (backups, batch jobs), physical and logical security. | Walkthrough for design, sample testing for operating effectiveness, supplemented by direct inspection of system configurations. |

The relationship between the three categories matters for testing strategy. Weak ITGCs undermine reliance on any automated process-level control. Weak entity-level controls (for example, a poor control environment) increase the residual risk that even well-designed and operated process-level controls will fail at audit-relevant moments. The auditor’s evaluation flows from entity to process to IT, and gaps at higher levels expand testing at lower levels.

Design effectiveness vs operating effectiveness (comparison)

This is the single most-asked question in first-year audit training. The table below summarizes the conceptual, evidentiary, and timing distinctions in one view.

| Dimension | Design effectiveness | Operating effectiveness |
|---|---|---|
| Definition | Whether the control, if it operates as described, would prevent or detect a material misstatement. | Whether the control actually operated as designed throughout the audit period. |
| Evidence required | Walkthrough of one transaction, inquiry of personnel, observation of the control in operation, inspection of supporting documentation. | Sample of control instances across the audit period, inspection of evidence that the control operated for each instance. |
| Sample size | Typically 1 walkthrough per control. | Frequency-based: 1 for annual, 2 for quarterly, 2 to 5 for monthly, 5 to 15 for weekly, 15 to 25 for daily. |
| Typical timing | Interim work, often performed 3 to 6 months before period end. | Interim work covering 9 months plus roll-forward work covering the final 3 months. |
| Conclusion order | Concluded first. Control must be well-designed before operating effectiveness can be tested. | Concluded after design effectiveness. Skipped if design is ineffective. |
| Standard reference | AU-C 315 (SAS 145) paragraph .26; PCAOB AS 2201 paragraph 42. | AU-C 330 paragraphs .08 through .11; PCAOB AS 2201 paragraphs 44 through 47. |

The practical implication. If design fails, operating effectiveness testing stops and the auditor reports a control deficiency. If design passes and operating effectiveness fails, the auditor reports a control deficiency tied to operation. The severity of the deficiency (control deficiency, significant deficiency, or material weakness) depends on the magnitude and likelihood of misstatement that could result, evaluated under PCAOB AS 2201 paragraphs 62 through 70 or the equivalent AICPA guidance.
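The severity ladder can be sketched as a simple decision function. This is a deliberately simplified illustration: the parameter names are hypothetical, and a real evaluation under AS 2201 or AU-C 265 weighs compensating controls, aggregation of deficiencies, and qualitative factors that a four-input function cannot capture.

```python
def classify_deficiency(reasonable_possibility, potential_misstatement,
                        materiality, merits_governance_attention):
    """Simplified severity classification for one identified deficiency.

    reasonable_possibility: True if a misstatement is reasonably possible.
    potential_misstatement / materiality: amounts in the same currency.
    merits_governance_attention: auditor judgment, supplied as a bool.
    """
    if reasonable_possibility and potential_misstatement >= materiality:
        return "material weakness"
    if merits_governance_attention:
        return "significant deficiency"
    return "control deficiency"
```

The function only encodes the order of the questions. The inputs themselves, especially likelihood and the governance-attention judgment, are where the professional judgment sits.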

Recent changes affecting internal controls testing

Three standard-setter developments shape the work in 2026.

SAS 145. Effective for audits of periods ending on or after December 15, 2023, SAS 145 superseded AU-C section 315 as originally issued in SAS 122. Material changes: explicit separation of inherent risk and control risk, an expanded definition of significant risk, new requirements for understanding the entity’s use of IT, and enhanced documentation. The standard is now embedded in firm methodologies, but the documentation burden remains heavy.

PCAOB AS 1000. Effective for fiscal years beginning on or after December 15, 2024, AS 1000 consolidated several legacy standards into a single standard covering the general responsibilities of the auditor. It reinforced professional skepticism, engagement partner direction and supervision, and documentation of judgments. See our explainer on PCAOB AS 1000 general responsibilities for the full scope.

Audit data analytics. Major firms have invested in audit data analytics platforms (KPMG Clara, Deloitte Omnia, EY Helix, PwC Aura, BDO Atlas, RSM Advance) that ingest full populations rather than samples. For automated controls and certain process-level controls, 2026 audit responses increasingly rely on 100% population analytics. Evidence quality rises, but the auditor must then establish data completeness, source system reliability, and reconciliation of the extract to the general ledger, each of which adds procedures.
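The completeness step often comes down to agreeing the extracted population to the general ledger before any analytic is run on it. A minimal sketch of that reconciliation follows. The input shapes (a list of account and amount pairs, and a dictionary of GL balances) are assumptions for illustration and do not reflect the data model of any firm platform named above.

```python
from collections import defaultdict
from decimal import Decimal

def reconcile_to_gl(extract_lines, gl_balances, tolerance=Decimal("0.00")):
    """Compare extracted line totals per account with GL balances.

    extract_lines: iterable of (account, amount) with amounts as strings.
    gl_balances:   dict of account -> balance as a string.
    Returns {account: extract total minus GL balance} where the absolute
    difference exceeds the tolerance.
    """
    totals = defaultdict(Decimal)
    for account, amount in extract_lines:
        totals[account] += Decimal(amount)

    differences = {}
    for account in set(totals) | set(gl_balances):
        extract_total = totals.get(account, Decimal("0"))
        gl_total = Decimal(gl_balances.get(account, "0"))
        diff = extract_total - gl_total
        if abs(diff) > tolerance:
            differences[account] = diff
    return differences
```

`Decimal` is used instead of floats so that cent-level differences are not masked or created by rounding. An empty result supports, but does not by itself prove, that the extract is complete.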

The four most common deficiency findings

The same control deficiencies appear in audit comment letters and Section 404 disclosures year after year. The PCAOB inspection reports for the 2023 and 2024 inspection cycles, material weakness disclosures in SEC filings, and the AICPA Peer Review program findings all converge on the same short list.

Segregation of duties gaps. The classic finding. The same person initiates a vendor master change and approves the resulting payment. The same person prepares and posts manual journal entries without independent review. The same person performs the bank reconciliation and has authority to release wire transfers. Mid-market companies under $250 million in revenue carry segregation of duties findings at rates well above the public-company average because the underlying constraint is staffing.
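A first-pass segregation of duties check can be run against a listing of who holds which duties. The sketch below uses the three conflicts described above. The duty labels and the input format are hypothetical; in practice they would come from system role extracts and the company's own conflict matrix.

```python
# Conflicting duty pairs taken from the examples above (labels are hypothetical).
CONFLICTING_DUTIES = [
    ("vendor_master_change", "payment_approval"),
    ("journal_entry_prepare", "journal_entry_post"),
    ("bank_reconciliation", "wire_release"),
]

def sod_conflicts(duties_by_user):
    """duties_by_user maps user -> set of duty labels.

    Returns a list of (user, duty_a, duty_b) for every conflicting pair
    held by the same person.
    """
    conflicts = []
    for user, duties in sorted(duties_by_user.items()):
        for duty_a, duty_b in CONFLICTING_DUTIES:
            if duty_a in duties and duty_b in duties:
                conflicts.append((user, duty_a, duty_b))
    return conflicts
```

A flagged conflict is a lead, not a conclusion: a compensating control such as independent review of the output may mitigate it.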

Manual journal entry review. Post-Sarbanes-Oxley, auditors have focused on manual journal entries as the most common vector for management override. The deficiency: entries posted without independent review, with review performed but not documented, or with a reviewer lacking the independence or competence to detect a misstatement. The fix is a documented manual journal entry log with reviewer signature, date, and explanation of unusual entries.
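Two of the three failure modes above (no review, review not documented) plus the independence part of the third can be screened from a journal entry log. The following sketch assumes a hypothetical log format with `id`, `preparer`, `reviewer`, and `review_date` fields. Reviewer competence cannot be tested this way and still requires auditor judgment.

```python
def journal_entry_exceptions(entries):
    """Return (entry_id, reason) for manual entries lacking evidenced,
    independent review. Each entry is a dict with hypothetical keys:
    id, preparer, reviewer, review_date.
    """
    exceptions = []
    for entry in entries:
        reviewer = entry.get("reviewer")
        if not reviewer:
            exceptions.append((entry["id"], "no review"))
        elif reviewer == entry["preparer"]:
            exceptions.append((entry["id"], "reviewer not independent"))
        elif not entry.get("review_date"):
            exceptions.append((entry["id"], "review not documented"))
    return exceptions
```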

User access controls. Three sub-findings recur: terminated users retain access beyond the termination date, periodic access reviews are not performed or not documented, and role-level segregation is not enforced. User access deficiencies undermine reliance on any automated control in the same system.
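The first sub-finding, access surviving termination, is usually tested by comparing an HR termination listing with access removal dates from the system. A minimal sketch, assuming both listings have already been reduced to dictionaries keyed by user ID (the function and parameter names are hypothetical):

```python
from datetime import date

def access_after_termination(terminations, access_removed, as_of):
    """Flag users whose access outlived their termination date.

    terminations:   dict of user -> termination date.
    access_removed: dict of user -> date access was removed; a missing
                    user is treated as still active at `as_of`.
    Returns a list of (user, days of access after termination).
    """
    exceptions = []
    for user, term_date in sorted(terminations.items()):
        removed = access_removed.get(user)
        end_date = removed if removed is not None else as_of
        if end_date > term_date:
            exceptions.append((user, (end_date - term_date).days))
    return exceptions
```

Whether a one-day lag counts as a deviation depends on the company's stated policy (for example, removal within a set number of business days), which this sketch does not model.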

Change management. Code or configuration changes to financially significant applications go to production without documented approval, testing evidence, or segregation between developer and deployer. The fix is a documented workflow (Jira, ServiceNow, or similar) with mandatory approval gates and traceable links to test and deployment records.
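The three gaps described above can be screened from an export of change tickets. The sketch below assumes a generic list of dictionaries with hypothetical field names (`approved_by`, `test_evidence`, `developer`, `deployer`). It does not use or imply any Jira or ServiceNow API.

```python
def change_exceptions(changes):
    """Return {change_id: [reasons]} for changes missing required evidence.

    Each change is a dict with hypothetical keys: id, approved_by,
    test_evidence, developer, deployer.
    """
    findings = {}
    for change in changes:
        reasons = []
        if not change.get("approved_by"):
            reasons.append("no documented approval")
        if not change.get("test_evidence"):
            reasons.append("no testing evidence")
        developer = change.get("developer")
        if developer and developer == change.get("deployer"):
            reasons.append("developer deployed own change")
        if reasons:
            findings[change["id"]] = reasons
    return findings
```

The ticket export only covers changes that were logged. The auditor still needs a separate population check, such as comparing production deployment logs to tickets, to catch changes that bypassed the workflow.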

Common pitfalls in controls testing

Beyond the deficiencies, several procedural mistakes recur.

Walking through the wrong instance. The walkthrough must cover a representative instance. Walking through the first transaction of the year, when staff are still ramping into the new process, can mislead.

Confusing process documentation with control documentation. Process flowcharts describe how transactions move; they do not document controls. Controls are specific: the action that prevents or detects the misstatement, performed by a specific person or system, at a specific point.

Skipping the design conclusion. Audit teams sometimes jump straight to sampling operating effectiveness. If the auditor later discovers a design flaw (for example, the reviewer is not independent of the preparer), all the operating effectiveness work has to be reperformed.

Underestimating ITGCs. A clean process-level audit gets undone by ITGC deficiencies that block reliance on automated controls. Plan ITGC scope at the start, not in fieldwork.

Misclassifying deficiency severity. AS 2201 requires evaluation of magnitude and likelihood. A reasonable possibility that a material misstatement will not be prevented or detected produces a material weakness; a deficiency that is less severe but still important enough to merit the attention of those charged with governance is a significant deficiency. Under-reporting is the more common and more dangerous error.

Frequently asked questions

What is the difference between a control deficiency, a significant deficiency, and a material weakness?
A control deficiency exists when the design or operation of a control does not allow management or personnel to prevent or detect misstatements on a timely basis. A significant deficiency is a deficiency, or combination of deficiencies, less severe than a material weakness yet important enough to merit attention by those charged with governance. A material weakness is a deficiency, or combination of deficiencies, such that there is a reasonable possibility that a material misstatement of the financial statements will not be prevented or detected on a timely basis. The definitions appear in PCAOB AS 2201 and AICPA AU-C 265.
Do private companies have to test internal controls?
Private companies subject to a financial statement audit do not have to have their controls audited under Section 404, but the auditor still has to understand and consider the company’s internal controls under SAS 145. The auditor can either test controls and reduce substantive procedures, or assess control risk at maximum and perform fully substantive procedures. Most private-company audits use a hybrid approach.
What is the relationship between SOC 2 controls and ICFR controls?
SOC 2 controls cover the AICPA Trust Services Criteria (Security, Availability, Processing Integrity, Confidentiality, Privacy) at a service organization. ICFR controls cover the controls over financial reporting at a reporting entity. The two overlap substantially in the IT general controls domain (access, change management, computer operations) and increasingly in the entity-level controls domain. A company holding both a SOC 2 Type 2 and an ICFR audit can often share evidence across the two engagements.
How long does controls testing take in a typical audit?
For a mid-market private-company audit, controls testing typically runs 3 to 6 weeks of audit team effort, performed over interim (Q3 fieldwork) and final (year-end fieldwork) phases. For a public-company integrated audit, controls testing can run 3 to 6 months of effort across interim, year-end, and roll-forward phases.
Can controls testing be performed remotely?
Yes. The pandemic accelerated the shift to remote controls testing, and 2026 audits remain predominantly remote. Walkthroughs are screen-shared, evidence is inspected via secure file transfer, and inquiry is performed by video. Physical observation (for example, counting inventory or observing physical security at a data center) remains on-site.
What sample size does an auditor use for an automated control?
Typically a sample of 1, combined with a test of the relevant IT general controls (access, change management) that ensure the automated control’s configuration cannot change without authorization. The logic: if the configuration is locked down and the ITGCs are effective, one observation establishes that the configuration operates as designed.
What does roll-forward testing mean?
Roll-forward testing is the audit work performed between the date of interim controls testing and the period-end date. The auditor tests a small additional sample of control instances during the roll-forward period to conclude that the control continued to operate effectively through the end of the period. The roll-forward sample is typically 2 to 5 additional instances depending on the length of the gap.
What happens when a control deviation is found in the sample?
The auditor evaluates whether the deviation is an isolated instance or an indication of a broader problem. The audit response can include: expanding the sample, testing a related compensating control, performing root cause analysis with management, and ultimately concluding whether the control is operating effectively. A confirmed control failure typically leads to a control deficiency, significant deficiency, or material weakness assessment, with related substantive procedures performed to address the residual risk.

Bottom line

Internal controls testing under SAS 145 and PCAOB AS 2201 follows a top-down, risk-based approach: entity-level first, then significant accounts, then relevant assertions, then specific controls. Design effectiveness must be concluded before operating effectiveness is tested. Sample sizes scale with frequency from 1 for annual controls to 25 for daily controls, with adjustments for risk and prior-period deviations. Four deficiencies (segregation of duties, manual journal entry review, user access, change management) are the ones that recur most often in material weakness disclosures and audit comment letters.

Sources and methodology

Primary sources: AICPA SAS 145, “Understanding the Entity and Its Environment and Assessing the Risks of Material Misstatement” (AU-C Section 315 revised), effective for audits of periods ending on or after December 15, 2023; AICPA AU-C 265, “Communicating Internal Control Related Matters Identified in an Audit”; AICPA AU-C 330, “Performing Audit Procedures in Response to Assessed Risks and Evaluating the Audit Evidence Obtained”; PCAOB AS 2201, “An Audit of Internal Control Over Financial Reporting That Is Integrated with an Audit of Financial Statements”; PCAOB AS 1000, “General Responsibilities of the Auditor in Conducting an Audit,” effective for fiscal years beginning on or after December 15, 2024; COSO 2013, “Internal Control Integrated Framework”; AICPA Audit Guide on Audit Sampling. ACFE 2024 Report to the Nations cited for fraud loss statistics. Sample size conventions reflect AICPA Audit Guide ranges and the methodologies of major US audit firms.