Pentesting: summary and checklists
This document summarizes a pragmatic, end‑to‑end penetration testing approach and provides ready‑to‑use checklists. It is aligned with the OWASP Web Security Testing Guide (WSTG v4.2) and common OSCP methodology. Use the checklists to plan, execute, evidence, and report engagements consistently.
How to use this document
- Skim the Table of Contents and pick the scope you are testing (web, API, network, OS).
- Use the methodology as a sequence, and the checklists as quick run sheets.
- Inline links and acronym expansions help non‑specialists follow along.
Table of Contents
- Pentesting: summary and checklists
- How to use this document
- Purpose and scope
- Methodology overview
- Web application testing
- API testing focus
- Network and infrastructure
- Operating system and platform notes (macOS)
- Evidence, safety, and ethics
- Reporting and risk rating
- Deliverables
- Checklists
- Sources
Purpose and scope
- Goal: The primary objective is to identify, validate, and demonstrate the impact of security weaknesses. The engagement must deliver actionable remediation guidance to improve the target's security posture. It's not just about finding flaws, but about showing how they can be fixed.
- Scope: A well-defined scope is critical for a successful engagement. It must clearly delineate targets, techniques, and timelines.
  - Targets: Specify what is in-scope (e.g., `api.example.com`, iOS app v2.1) and what is explicitly out-of-scope (e.g., corporate infrastructure, third-party integrations).
  - Techniques: Define allowed and disallowed testing methods. For instance, are denial-of-service (DoS) tests permitted? Is social engineering of employees allowed?
  - Time Window: Testing should be confined to an agreed-upon timeframe (e.g., "Mon-Fri, 9am-5pm PST") to minimize disruption and align with monitoring capabilities.
- Rules of Engagement (ROE): These are the ground rules for the test.
  - Authorization: Obtain explicit, written permission from the asset owner.
  - Safe Harbor: Legal protection for the testing team for actions performed within the agreed scope.
  - Contact Points: Establish primary points of contact for both the testing team and the client for escalations and emergencies.
  - Data Handling: Specify how sensitive data discovered during the test should be handled, stored, and ultimately purged.
Methodology overview
A structured methodology ensures comprehensive and repeatable testing. This process follows a standard lifecycle from planning to validation.
- Pre‑engagement: This foundational phase aligns the testing team and the client. It involves defining the scope, rules of engagement (ROE), provisioning test accounts, setting success criteria, and establishing communication and incident handling plans. Clear expectations are key.
- Reconnaissance: Gathering information about the target.
  - Passive: Collecting information from public sources without directly engaging the target (e.g., DNS records, search engine hacking, social media).
  - Active: Directly probing the target to discover hosts, open ports, and services (e.g., using tools like `nmap` and `gobuster`). The goal is to map the attack surface, enumerate the technology stack, and identify API surfaces (see the sketch after this list).
- Threat Modeling: This phase involves thinking like an attacker. Identify high-value assets (e.g., user data, payment info), analyze trust boundaries (e.g., between a public API and an internal service), and map out potential attacker paths and abuse cases.
- Vulnerability Analysis: With a map of the target, begin searching for flaws. This is a mix of automated scanning (e.g., with Nessus, Burp Suite Pro) and manual testing to find misconfigurations, weak controls (e.g., lack of rate limiting), and insecure defaults.
- Exploitation: The goal here is to safely validate identified vulnerabilities and demonstrate their real-world impact. This must be done with extreme care to minimize the "blast radius" and avoid disrupting production systems. For example, instead of dropping a table, an SQL injection could be proven by extracting the database version.
- Post‑exploitation: After gaining an initial foothold, this phase explores what an attacker could do next. This might involve controlled attempts at lateral movement (moving to other systems) or privilege escalation (gaining higher-level access). All actions must be carefully documented.
- Reporting: A critical deliverable. The report should detail findings, assign risk ratings (e.g., using CVSS), explain the business impact, and provide clear, step-by-step instructions for reproduction and remediation.
- Remediation Validation: After the client has implemented fixes, the tester re-tests the specific vulnerabilities to verify that the mitigations are effective. This closes the feedback loop and confirms risk reduction.
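As a minimal illustration of active reconnaissance, the sketch below performs TCP connect checks against a handful of common ports using only Python's standard library. The target hostname and port list are placeholders; run it only against hosts covered by written authorization.

```python
# Minimal active-recon sketch: TCP connect checks against an in-scope host.
# Hypothetical target and port list; substitute values from your approved scope.
import socket

TARGET = "app.example.com"  # placeholder, must be in scope
COMMON_PORTS = [22, 80, 443, 3306, 8080]

def check_port(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for port in COMMON_PORTS:
    state = "open" if check_port(TARGET, port) else "closed/filtered"
    print(f"{TARGET}:{port} -> {state}")
```

A connect scan like this is slower and noisier than `nmap`, but it is useful for quick spot checks when dedicated tooling is not available on the jump host.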
Web application testing (OWASP WSTG‑aligned)
This section aligns with the OWASP Web Security Testing Guide, covering common vulnerability categories.
- Information Gathering: Map the application's attack surface. Enumerate subdomains (`subfinder`, `amass`), discover directories and files (`gobuster`, `dirb`), identify parameters, and map out client-side routes and API endpoints. Analyze third-party scripts for supply chain risks.
- Configuration and Deployment Management: Look for misconfigurations. Examples include enabled directory listing, exposed default files (e.g., `.env`), debug flags left on in production, and verbose error messages that leak internal state. Check for proper implementation of security headers like `Content-Security-Policy` (CSP) and `Strict-Transport-Security` (a baseline-check sketch follows this list).
- Identity and Authentication: Test the gates of the application. Does it enforce a strong password policy? Is MFA implemented correctly? Are account recovery flows secure? Can you brute-force logins or stuff stolen credentials without being blocked?
- Authorization: Once authenticated, what can a user do?
  - Horizontal access control: Can User A access User B's data by changing an ID in a URL (e.g., `/orders/123` to `/orders/124`)? This is an Insecure Direct Object Reference (IDOR).
  - Vertical access control: Can a regular user access an admin panel (e.g., `/admin`)?
- Session Management: Test how the application handles sessions. Are cookies flagged as `HttpOnly` and `Secure`? Does the session token rotate upon login or privilege change? Does logging out actually invalidate the session on the server?
- Input Validation and Injection: Test how the application handles user-supplied data.
  - SQL Injection (SQLi): If a query is built by string concatenation, such as `SELECT * FROM products WHERE id = '` + product_id + `'`, an attacker could supply `1' OR '1'='1` to dump all products.
  - Cross-Site Scripting (XSS): If a search term is reflected on the page without encoding, an input like `<script>alert('XSS')</script>` could execute in a user's browser. Test for Reflected, Stored, and DOM-based XSS.
  - Also test for Server-Side Request Forgery (SSRF), XML External Entity (XXE) injection, and insecure deserialization.
- Business Logic: Abuse the intended functionality of the application. Can you manipulate a price in a checkout flow? Can you submit a form multiple times in a race condition to get a double discount?
- Data Protection: Check for sensitive data exposure. Is personally identifiable information (PII) encrypted at rest and in transit? Are API keys or tokens accidentally leaked in URLs, logs, or client-side code?
- Client‑Side Security: Analyze risks that manifest in the browser. This includes DOM-based XSS, insecure JavaScript dependencies (supply chain attacks), and misuse of web storage (e.g., storing sensitive data in `localStorage`).
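Several of the configuration and session checks above lend themselves to scripted baselines. Below is a minimal sketch, assuming Python with the third-party `requests` package and a placeholder URL, that reports missing security headers and cookie flags; it supplements manual review rather than replacing it.

```python
# Baseline check for security headers and cookie flags (sketch).
# Assumes the `requests` package; the URL is a placeholder for an in-scope app.
import requests

EXPECTED_HEADERS = [
    "Content-Security-Policy",
    "Strict-Transport-Security",
    "X-Content-Type-Options",
    "X-Frame-Options",
]

resp = requests.get("https://app.example.com/login", timeout=10)

for header in EXPECTED_HEADERS:
    status = "present" if header in resp.headers else "MISSING"
    print(f"{header}: {status}")

# Cookie flags: http.cookiejar stores HttpOnly as a nonstandard attribute.
# Note the attribute name is case-sensitive as sent by the server.
for cookie in resp.cookies:
    http_only = cookie.has_nonstandard_attr("HttpOnly")
    print(f"cookie {cookie.name}: Secure={cookie.secure}, HttpOnly={http_only}")
```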
API testing focus
Modern applications heavily rely on APIs. Testing them requires a focus on their unique characteristics.
- Discovery: Identify all API endpoints and supported methods. This often involves finding OpenAPI/Swagger or GraphQL schema definitions, which can reveal undocumented or hidden endpoints. Watch for different versions, like `/api/v1/users` vs `/api/v2/users`.
- Authentication and Authorization (AuthN/AuthZ):
  - Broken Object Level Authorization (BOLA): The most common API flaw. Similar to IDOR, can an authenticated user access resources they don't own? For example, accessing `/api/v1/users/another_user_id/profile` should be forbidden (see the probe sketch after this list).
  - Broken Function Level Authorization (BFLA): Can a non-admin user call an admin-only endpoint, like `POST /api/v1/admin/users`?
  - Token Security: Are JSON Web Tokens (JWTs) properly validated (checking the signature and expiration)? Are token scopes enforced?
- Input and Serialization:
  - Mass Assignment: If an API endpoint automatically binds incoming JSON to object properties, an attacker might be able to overwrite sensitive fields, for example by sending `{"isAdmin": true}` when updating their own profile.
  - Data Handling: Test for proper content-type validation (e.g., rejecting `application/xml` if only `application/json` is expected) and secure file upload handling.
- Rate‑Limiting and Quotas: Test for the absence of rate-limiting on sensitive functions like login or password reset, which could allow for brute-force attacks. Check if reasonable quotas are enforced to prevent resource exhaustion.
- Data Exposure: Look for excessive data exposure, where an API returns more data than the client UI displays. For instance, an endpoint might return a user object with a `passwordHash` field that is simply ignored by the frontend but available to an attacker.
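To make the BOLA check concrete, here is a minimal probe sketch, assuming two seeded test accounts and hypothetical endpoint paths: user A's token is used to request an object owned by user B, and a 200 response indicates a finding.

```python
# BOLA probe sketch: user A tries to read user B's resource with A's token.
# Base URL, token, and object ID are placeholders for seeded test accounts.
import requests

BASE = "https://api.example.com/api/v1"  # placeholder, must be in scope
TOKEN_A = "..."                          # bearer token for test user A
USER_B_ID = "1002"                       # object owned by test user B

resp = requests.get(
    f"{BASE}/users/{USER_B_ID}/profile",
    headers={"Authorization": f"Bearer {TOKEN_A}"},
    timeout=10,
)

# Expected result: 403 or 404. A 200 means A can read B's data (BOLA).
if resp.status_code == 200:
    print(f"Possible BOLA: cross-account read returned 200 ({len(resp.content)} bytes)")
else:
    print(f"Access correctly denied: HTTP {resp.status_code}")
```

In practice, run this style of check across every object type and both directions (A reading B, B reading A) using accounts provisioned during pre‑engagement.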
Network and infrastructure
- External Footprint: Map the perimeter. This includes DNS enumeration (`dig`, `dnsrecon`), checking for TLS/SSL misconfigurations (`testssl.sh`; see the sketch after this list), and scanning for exposed services (`nmap`, `masscan`). Look for default credentials on management interfaces (e.g., routers, firewalls), CI/CD platforms, and artifact registries (e.g., Docker Hub, npm).
- Internal Network: Once inside, explore the local network. Enumerate hosts and services, and probe for weak protocols like SMB, NFS, and RPC. Test for network segmentation gaps: can a host in a lower-trust zone communicate with a high-trust zone?
- Active Directory (AD): In corporate environments, AD is often a primary target. Look for common misconfigurations, attempt Kerberoasting (extracting service account hashes) or AS-REP Roasting (extracting hashes for users without Kerberos pre-authentication).
- Privilege Escalation (Linux/Windows): Once on a host, the goal is to gain higher privileges. Search for kernel or driver vulnerabilities, misconfigured services running as root/SYSTEM, PATH variable hijacking (on Linux), or DLL hijacking (on Windows). Tools like `LinPEAS` and `WinPEAS` can automate enumeration.
- Cloud Environments (Conceptual): Cloud security testing focuses on configuration. Look for IAM policies that violate the principle of least privilege, exposed secrets in code or environment variables, publicly accessible storage buckets (e.g., AWS S3), and unprotected instance metadata services.
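As a lightweight companion to dedicated tools like `testssl.sh`, the sketch below uses Python's standard `ssl` module to report the negotiated TLS protocol version and certificate details for an in-scope host (the hostname is a placeholder).

```python
# TLS posture spot-check (sketch): negotiated protocol version and certificate
# details for an in-scope host. Raises on invalid certificates by design.
import socket
import ssl

HOST = "app.example.com"  # placeholder, must be in scope

context = ssl.create_default_context()
with socket.create_connection((HOST, 443), timeout=10) as sock:
    with context.wrap_socket(sock, server_hostname=HOST) as tls:
        print(f"Negotiated protocol: {tls.version()}")  # e.g. 'TLSv1.3'
        cert = tls.getpeercert()
        print(f"Certificate expires: {cert['notAfter']}")
        print(f"Issuer: {dict(pair[0] for pair in cert['issuer'])}")
```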
Operating system and platform notes (macOS)
For engagements involving macOS endpoints, consider the platform's specific security controls.
- Platform Controls: Assess the effectiveness of built-in security features.
  - Gatekeeper and Notarization: Ensures that only trusted software runs on the machine. Can it be bypassed?
  - System Integrity Protection (SIP): Protects core system files and processes. Is it enabled?
  - Transparency, Consent, and Control (TCC): Manages permissions for apps to access sensitive user data (e.g., location, contacts). Are there ways to trick a user into granting excessive permissions?
- Hardening: Verify standard security hygiene (a spot-check sketch follows this section).
  - Software Sources: Is the system configured to only allow apps from the App Store and identified developers?
  - FileVault: Is full-disk encryption enabled?
  - Firewall: Is the application firewall enabled and configured correctly?
  - Login Items / Launch Agents: Review for persistence mechanisms that could be used by malware.
- Telemetry and Logging: macOS has a Unified Logging system. Ensure that sensitive data (e.g., passwords, API keys) is not inadvertently being written to logs where it could be collected by an attacker or endpoint security agent.
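The platform and hardening controls above can be spot-checked from a shell. The sketch below, a rough helper rather than an authoritative audit, shells out to built-in macOS tools (`csrutil`, `spctl`, `fdesetup`, `socketfilterfw`) and prints their status output; output formats vary across macOS versions, so treat results as indicative.

```python
# Quick macOS hardening spot-check (sketch): reads SIP, Gatekeeper, FileVault,
# and application-firewall state via the platform's own command-line tools.
import subprocess

CHECKS = {
    "SIP": ["csrutil", "status"],
    "Gatekeeper": ["spctl", "--status"],
    "FileVault": ["fdesetup", "status"],
    "App Firewall": ["/usr/libexec/ApplicationFirewall/socketfilterfw",
                     "--getglobalstate"],
}

for name, cmd in CHECKS.items():
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
        print(f"{name}: {out.stdout.strip() or out.stderr.strip()}")
    except (OSError, subprocess.TimeoutExpired) as exc:
        print(f"{name}: check failed ({exc})")
```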
Evidence, safety, and ethics
Professional penetration testing requires a strong ethical framework and safety controls to prevent unintended harm.
- Data Minimization: Collect only the minimum evidence necessary to prove a vulnerability's impact. For example, query for a user count (`SELECT COUNT(id) FROM users`) instead of dumping the entire users table. Sanitize personally identifiable information (PII) from screenshots and logs wherever possible.
- Chain of Custody: Maintain a clear record of your actions. Keep timestamped notes, take hashes of any downloaded artifacts (see the sketch after this list), and clearly map your actions to the findings in your report. This ensures the integrity of your evidence.
- Safety Controls: Adhere strictly to the agreed-upon scope and rules of engagement. Use non-destructive payloads (e.g., `id` or `hostname` commands for OS injection instead of `rm -rf /`). Obtain explicit written approval before conducting any potentially disruptive tests, such as those for denial of service.
- Cleanup: At the end of the engagement, be a good guest. Remove any test accounts, backdoors, or payloads you introduced. Revert any configuration changes to their original state. Provide a checklist of cleanup actions to the system owners.
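For the chain-of-custody hashing mentioned above, a small helper like the following sketch (paths are placeholders) records a SHA-256 digest and UTC timestamp for every collected artifact in a JSON manifest.

```python
# Chain-of-custody helper (sketch): record a SHA-256 hash and UTC timestamp
# for each collected artifact so evidence integrity can be verified later.
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def record_artifact(path: Path) -> dict:
    """Hash one evidence file and return a manifest entry."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return {
        "file": str(path),
        "sha256": digest,
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }

# Example: hash everything under an evidence directory (placeholder path).
entries = [record_artifact(p) for p in Path("evidence").rglob("*") if p.is_file()]
Path("evidence_manifest.json").write_text(json.dumps(entries, indent=2))
print(f"Recorded {len(entries)} artifacts")
```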
Reporting and risk rating
The report is the primary deliverable of the engagement and must be clear, actionable, and professional.
- Content: A standard report includes an executive summary (high-level overview for management), a detailed breakdown of the scope and methodology, and then the core findings. Each finding should include:
  - A description of the vulnerability and affected assets.
  - Reproducible, step-by-step instructions.
  - Supporting evidence (e.g., screenshots, code snippets).
  - An assessment of the impact and likelihood.
  - A final risk rating.
  - Actionable remediation guidance.
- Risk Rating: Use a consistent and defensible rating scheme, like the Common Vulnerability Scoring System (CVSS) v3.1. A rating might look like: `8.8 (High) - CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H` (a scoring sketch follows this list). Clearly document any assumptions made during the rating process.
- Remediation: Focus on practical advice. Prioritize high-impact, systemic fixes. Propose defense-in-depth controls (e.g., not just fixing the XSS, but also implementing a strong CSP) and suggest opportunities for improved detection and monitoring.
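To show how a numeric score falls out of a vector, here is a minimal CVSS v3.1 base-score calculator for Scope:Unchanged vectors, using the metric weights and formulas published by FIRST; it reproduces the 8.8 for the example vector above.

```python
# Minimal CVSS v3.1 base-score calculator (sketch). Handles Scope:Unchanged
# vectors only; metric weights and formulas follow the FIRST CVSS v3.1 spec.
import math

AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.20}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}  # Scope:Unchanged weights
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}

def roundup(x: float) -> float:
    """CVSS-specified round-up to one decimal place."""
    return math.ceil(round(x, 5) * 10) / 10

def base_score(vector: str) -> float:
    m = dict(part.split(":") for part in vector.split("/")[1:])
    assert m["S"] == "U", "this sketch only handles Scope:Unchanged"
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * PR[m["PR"]] * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))

print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))  # 8.8
```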
Deliverables
- Formal Report: A human-readable report (typically PDF) with a sanitized version for wider distribution if needed.
- Technical Annex: An appendix with detailed reproduction steps, proof-of-concept scripts, and environment notes that may be too verbose for the main report.
- Evidence Archive: A securely shared archive (e.g., encrypted zip file) containing all evidence, with hashes to ensure integrity.
- Remediation Validation Memo: A short document confirming the results of the re-test after fixes have been applied.
Checklists
Pre‑engagement
- Defined scope, ROE, test window, and contacts
- Written authorization and safe‑harbor language
- Test data and seeded accounts provisioned; MFA paths defined
- Logging and detection teams informed with expected activity patterns
- Data handling requirements (PII, secrets, retention) agreed
- Communications cadence, incident handling, and escalation mapped
Web application (WSTG mapping)
- Enumerated hosts, routes, parameters, and client‑side assets
- Baseline security headers and TLS posture evaluated
- Authn: MFA, password policy, recovery flows, lockout/rate limiting
- Authz: IDOR/horizontal/vertical checks across key objects
- Sessions: cookie flags, rotation, invalidation, fixation
- Injection: SQL/NoSQL/OS/LDAP/XXE/template/deserialization tested
- XSS: reflected/stored/DOM; CSP effectiveness and bypasses
- Business logic: workflow abuse, race conditions, replay
- Sensitive data exposure and secrets in code, repos, or responses
- File upload and serialization handling with content‑type validation
API
- Collected and validated OpenAPI/GraphQL schemas
- Checked BOLA/BFLA, object filtering, field‑level authorization
- Verified token scopes, audience/issuer validation, key rotation
- Enforced rate limits/quotas per token/user/IP/resource
- Mass assignment/type confusion and pagination boundary tests
Network and internal
- External recon: DNS/TLS, exposed services, management interfaces
- Internal enum: SMB/NFS/RPC/LDAP/WinRM; weak segmentation paths
- Credential hygiene: default creds, password reuse, secrets in shares
- AD: roasting/delegation/ACL misconfigs and lateral paths
Privilege escalation and post‑exploitation
- Enumerated local privilege escalation vectors (Linux/Windows)
- Verified least‑privilege for services/agents and scheduled tasks
- Collected minimal evidence; documented commands and timestamps
- Cleaned up payloads, reverted configs, removed accounts/keys
Reporting and validation
- Findings have clear repro, impact, and asset mapping
- Risk ratings consistent; assumptions documented
- Actionable remediation and detection recommendations provided
- Re‑test plan agreed; validation results captured and signed off
Sources
- OWASP WSTG v4.2: https://owasp.org/www-project-web-security-testing-guide/
- OWASP API Security Top‑10: https://owasp.org/API-Security/
- CVSS v3.1: https://www.first.org/cvss/